Abstract

Context-free approaches to static analysis gain precision over clas- 
sical approaches by perfectly matching returns to call sites — 
a property that eliminates spurious interprocedural paths. Var- 
doulakis and Shivers's recent formulation of CFA2 showed that it 
is possible (if expensive) to apply context-free methods to higher- 
order languages and gain the same boost in precision achieved over 
first-order programs. 

To this young body of work on context-free analysis of higher- 
order programs, we contribute a pushdown control-flow analy- 
sis framework, which we derive as an abstract interpretation of a 
CESK machine with an unbounded stack. One instantiation of this 
framework marks the first polyvariant pushdown analysis of higher- 
order programs; another marks the first polynomial-time analy- 
sis. In the end, we arrive at a framework for control-flow analysis 
that can efficiently compute pushdown generalizations of classical 
control-flow analyses. 

1. Introduction 

Static analysis is bound by theory to accept diminished precision as 
the price of decidability. The easiest way to guarantee decidability 
for a static analysis is to restrict it to a finite state-space. Not 
surprisingly, finite state-spaces have become a crutch. 

Whenever an abstraction maps an infinite (concrete) state-space 
down to the finite state-space of a static analysis, the pigeon-hole 
principle forces merging. Distinct execution paths and values can 
and do become indistinguishable under abstraction, e.g., 3 and 4 
both abstract to the same value: positive. 

Our message is that finite abstraction goes too far: we can ab- 
stract into an infinite state-space to improve precision, yet remain 
decidable. Specifically, we can abstract the concrete semantics of 
a higher-order language into a pushdown automaton (PDA). As an 
infinite-state system, a PDA-based abstraction preserves more in- 
formation than a classical finite-state analysis. Yet, being less pow- 
erful than a Turing machine, properties important for computing 
control-flow analysis (e.g. emptiness, intersection with regular lan- 
guages, reachability) remain decidable. 

1.1 The problem with merging 

A short example provides a sense of how the inevitable merging
that occurs under a finite abstraction harms precision. Shivers's
0CFA [Shivers 1991] produces spurious data-flows and return-flows
in the following example:

(let* ((id (lambda (x) x))
       (a  (id 3))
       (b  (id 4)))
  a)



0CFA says that the flow set for the variable a contains both 3
and 4. In fact, so does the flow set for the variable b. For
return-flow,¹ 0CFA says that the invocation of (id 4) may return to
the invocation of (id 3) or (id 4) and vice versa; that is, according
to Shivers's 0CFA, this program contains a loop.
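
To make the merging concrete, the following Python sketch mimics what a monovariant (0CFA-style) analysis does to the example: it keeps a single flow set per variable name. The names `flows` and `bind` are illustrative only, and the sketch hard-codes the two calls instead of parsing the program, so it should be read as a toy model rather than as an implementation of 0CFA.

```python
from collections import defaultdict

# One flow set per variable name: the monovariant (0CFA-style) abstraction.
flows = defaultdict(set)

def bind(var, values):
    """Record that every value in `values` may flow to `var`."""
    flows[var] |= set(values)

# (id 3) and (id 4) both bind the single abstract variable x.
bind("x", {3})
bind("x", {4})

# The body of id returns x. Because return-flow is merged, each call
# site receives everything that may ever flow to x.
bind("a", flows["x"])
bind("b", flows["x"])

print(sorted(flows["a"]))  # [3, 4]
print(sorted(flows["b"]))  # [3, 4]
```

Both a and b end up with the flow set {3, 4}, which is exactly the spurious data-flow described above.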

To combat merging, control-flow analyses have focused on
increasing context-sensitivity [Shivers 1991]. Context-sensitivity
tries to qualify any answer that comes back from a CFA with a
context in which that answer is valid. That is, instead of answering
"λ42 may flow to variable v13," a context-sensitive analysis
might answer "λ42 may flow to variable v13 when bound after calling
f." While context-sensitivity recovers some lost precision, it is
no silver bullet. A finite-state analysis stores only a finite amount
of program context to discriminate data- and control-flows during
analysis. Yet, the pigeon-hole principle remains merciless: as long
as the state-space is finite, merging is inevitable for some programs.

Of all the forms of merging, the most pernicious is the merg- 
ing of return-flow information. As the example shows, a finite-state 
control-flow analysis will lose track of where return-points return 
once the maximum bound on context is exceeded. Even in pro- 
grams with no higher-order functions, return-flow merging will still 
happen during control-flow analysis. 

1.2 A first shot: CFA2 

Vardoulakis and Shivers's recent work on CFA2 [Vardoulakis and 
Shivers 2010] constitutes an opening salvo on ending the return- 
flow problem for the static analysis of higher-order programs. 
CFA2 employs an implicit pushdown system that models the stack 
of a program. CFA2 solves the return-flow problem for higher-order 
programs, but it has drawbacks: 

1. CFA2 allows only monovariant precision.

2. CFA2 has exponential complexity in the size of the program. 

3. CFA2 is restricted to continuation-passing style. 

Our solution overcomes all three drawbacks: it allows polyvariant
precision, we can widen it to $O(n^6)$-time complexity in the
monovariant case, and we can operate on direct-style programs.

1.3 Our solution: Abstraction to pushdown systems 

To prevent return-flow merging during higher-order control-flow 
analysis, we abstract into an explicit pushdown system instead of a 
finite-state machine. The program stack, which determines return- 
flow, will remain unbounded and become the pushdown stack. As a 
result, return-flow information will never be merged: in the abstract 
semantics, a function returns only whence it was called. 

¹ "Return-flow" analysis asks to which call sites a given return point may
return. In the presence of tail calls, which break the balance between calls
and returns, return-flow analysis differs from control-flow analysis.



1.4 Overview 

This paper is organized as follows: first, we define a variant of the
CESK machine [Felleisen and Friedman 1987] for the A-Normal
Form λ-calculus [Flanagan et al. 1993]. In performing analysis, we
wish to soundly approximate intensional properties of this machine 
when it evaluates a given program. To do so, we construct an ab- 
stract interpretation of the machine. The abstracted CESK machine 
operates much like its concrete counterpart and soundly approxi- 
mates its behavior, but crucially, many properties of the concrete 
machine that are undecidable become decidable when considered 
against the abstracted machine (e.g. "is a given machine configura- 
tion reachable?" becomes a decidable property). 

The abstract counterpart to the CESK machine is constructed 
by bounding the store component of the machine to some finite 
size. However, the stack component (represented as a continuation) 
is left unabstracted. (This is in contrast to finite-state abstractions 
that store-allocate continuations [Van Horn and Might 2010].) Un- 
like most higher-order abstract interpreters, the unbounded stack 
implies this machine has a potentially infinite set of reachable ma- 
chine configurations, and therefore enumerating them is not a fea- 
sible approach to performing analysis. 

Instead, we demonstrate how properties can be decided by trans- 
forming the abstracted CESK machine into an equivalent push- 
down automaton. We then reduce higher-order control-flow anal- 
ysis to deciding the non-emptiness of a language derived from the 
PDA. (This language happens to be the intersection of a regular lan- 
guage and the context-free language described by the PDA.) This 
approach — though concise, precise and decidable — is formidably 
expensive, with complexity doubly exponential in the size of the 
program. 

We simplify the algorithm to merely exponential in the size of 
the input program by reducing the control-flow problem to push- 
down reachability [Bouajjani et al. 1997]. Unfortunately, the ab- 
stracted CESK machine has an exponential number of control states 
with respect to the size of the program. Thus, pushdown reachabil- 
ity for higher-order programs appears to be inherently exponential. 

Noting that most control states in the abstracted CESK machine 
are actually unreachable, we present a fixed-point algorithm for de- 
ciding pushdown reachability that is polynomial-time in the num- 
ber of reachable control states. Since the pushdown systems pro- 
duced by abstracted CESK machines are sparse, such algorithms, 
though exponential in the worst case, are reasonable options. Yet, 
we can do better. 

Next, we add an ε-closure graph (a graph encoding no-stack-change
reachability) and a work-list to the fixed-point algorithm.
Together, these lower the cost of finding the reachable states of a
pushdown system from $O(|\Gamma|^4 m^5)$ to $O(|\Gamma|^2 m^4)$, where
$\Gamma$ is the stack alphabet and $m$ is the number of reachable
control states.

To drop the complexity of our analysis to polynomial-time in
the size of the input program, we must resort to both widening and
monovariance. Widening with a single-threaded store and using
a monovariant allocation strategy yields a pushdown control-flow
analysis with polynomial-time complexity, at $O(n^6)$, where $n$ is
the size of the program.

Finally, we briefly highlight applications of pushdown control- 
flow analyses that are outside the reach of classical ones, discuss 
related work, and conclude. 



2. Pushdown preliminaries 

In this work, we make use of both pushdown systems and pushdown
automata. (A pushdown automaton is a specific kind of pushdown
system.) There are many (equivalent) definitions of these
machines in the literature, so we adapt our own definitions from
[Sipser 2005]. Even those familiar with pushdown theory may want
to skim this section to pick up our notation.

2.1 Syntactic sugar 

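When a triple $(x, \ell, x')$ is an edge in a labeled graph, a little
syntactic sugar aids presentation:

\[ x \overset{\ell}{\rightarrowtail} x' \equiv (x, \ell, x'). \]

Similarly, when a pair $(x, x')$ is a graph edge:

\[ x \rightarrowtail x' \equiv (x, x'). \]

We use both string and vector notation for sequences:

\[ a_1 a_2 \ldots a_n \equiv \langle a_1, a_2, \ldots, a_n \rangle \equiv \vec{a}. \]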

2.2 Stack actions, stack change and stack manipulation 

Stacks are sequences over a stack alphabet $\Gamma$. Pushdown systems
do much stack manipulation, so to represent this more concisely,
we turn stack alphabets into "stack-action" sets; each character
represents a change to the stack: push, pop or no change.

For each character $\gamma$ in a stack alphabet $\Gamma$, the
stack-action set $\Gamma_{\pm}$ contains a push character $\gamma_{+}$
and a pop character $\gamma_{-}$; it also contains a no-stack-change
indicator, $\epsilon$:

\[
\begin{aligned}
g \in \Gamma_{\pm} ::={} & \epsilon && \text{[stack unchanged]} \\
\mid{} & \gamma_{+} \text{ for each } \gamma \in \Gamma && \text{[pushed } \gamma\text{]} \\
\mid{} & \gamma_{-} \text{ for each } \gamma \in \Gamma && \text{[popped } \gamma\text{].}
\end{aligned}
\]

In this paper, the symbol $g$ represents some stack action.

2.3 Pushdown systems 

A pushdown system is a triple $M = (Q, \Gamma, \delta)$ where:

1. $Q$ is a finite set of control states;

2. $\Gamma$ is a stack alphabet; and

3. $\delta \subseteq Q \times \Gamma_{\pm} \times Q$ is a transition relation.

We use $\mathbb{PDS}$ to denote the class of all pushdown systems.

Unlike the more widely known pushdown automaton, a pushdown
system does not recognize a language.

For the following definitions, let $M = (Q, \Gamma, \delta)$.

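• The configurations of this machine, $\mathit{Configs}(M)$, are pairs
over control states and stacks:

\[ \mathit{Configs}(M) = Q \times \Gamma^{*}. \]

• The labeled transition relation $(\longmapsto_M) \subseteq
\mathit{Configs}(M) \times \Gamma_{\pm} \times \mathit{Configs}(M)$
determines whether one configuration may transition to another while
performing the given stack action:

\[
\begin{aligned}
(q, \vec{\gamma}) \underset{M}{\overset{\epsilon}{\longmapsto}} (q', \vec{\gamma})
  &\text{ iff } q \overset{\epsilon}{\rightarrowtail} q' \in \delta && \text{[no change]} \\
(q, \gamma : \vec{\gamma}) \underset{M}{\overset{\gamma_{-}}{\longmapsto}} (q', \vec{\gamma})
  &\text{ iff } q \overset{\gamma_{-}}{\rightarrowtail} q' \in \delta && \text{[pop]} \\
(q, \vec{\gamma}) \underset{M}{\overset{\gamma_{+}}{\longmapsto}} (q', \gamma : \vec{\gamma})
  &\text{ iff } q \overset{\gamma_{+}}{\rightarrowtail} q' \in \delta && \text{[push].}
\end{aligned}
\]

• If unlabelled, the transition relation $(\longmapsto)$ checks whether
any stack action can enable the transition:

\[ c \underset{M}{\longmapsto} c' \text{ iff } c \underset{M}{\overset{g}{\longmapsto}} c' \text{ for some stack action } g. \]

• For a string of stack actions $g_1 \ldots g_n$:

\[ c_0 \underset{M}{\overset{g_1 \ldots g_n}{\longmapsto}} c_n \text{ iff }
   c_0 \underset{M}{\overset{g_1}{\longmapsto}} c_1 \underset{M}{\overset{g_2}{\longmapsto}} \cdots
   \underset{M}{\overset{g_n}{\longmapsto}} c_n \]

for some configurations $c_0, \ldots, c_n$.

• For the transitive closure:

\[ c \underset{M}{\longmapsto^{*}} c' \text{ iff } c \underset{M}{\overset{\vec{g}}{\longmapsto}} c' \text{ for some action string } \vec{g}. \]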

Note. Some texts define the transition relation $\delta$ so that
$\delta \subseteq Q \times \Gamma \times Q \times \Gamma^{*}$. In these
texts, $(q, \gamma, q', \vec{\gamma}) \in \delta$ means, "if in control
state $q$ while the character $\gamma$ is on top, pop the stack, transition
to control state $q'$ and push $\vec{\gamma}$." Clearly, we can convert between
these two representations by introducing extra control states to our
representation when it needs to push multiple characters.
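
The labeled transition relation on configurations can be sketched in a few lines of Python. This is a minimal illustration under our own encoding assumptions: stack actions are tagged pairs built by the hypothetical helpers `push` and `pop`, `EPS` stands for the no-change action, and a stack is a tuple with its top at index 0. None of these names come from the paper.

```python
EPS = ("eps", None)

def push(frame):
    return ("push", frame)

def pop(frame):
    return ("pop", frame)

def step(delta, config):
    """Yield (action, config') for every transition enabled in `config`.

    `delta` is a set of (q, action, q') triples; a configuration is a
    pair (q, stack) whose stack is a tuple with the top at index 0.
    """
    q, stack = config
    for (src, action, dst) in delta:
        if src != q:
            continue
        kind, frame = action
        if kind == "eps":                      # no change
            yield action, (dst, stack)
        elif kind == "push":                   # push frame on top
            yield action, (dst, (frame,) + stack)
        elif kind == "pop" and stack and stack[0] == frame:
            yield action, (dst, stack[1:])     # pop only a matching top

delta = {
    ("q0", push("a"), "q1"),
    ("q1", pop("a"), "q2"),
    ("q1", pop("b"), "q3"),
}
after_push = list(step(delta, ("q0", ())))
after_pop = [c for _, c in step(delta, ("q1", ("a",)))]
```

With the stack `("a",)` on top, only the pop of `a` is enabled from `q1`; the pop of `b` is blocked, which is the matching behavior that later keeps returns paired with their calls.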

2.4 Rooted pushdown systems 

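A rooted pushdown system is a quadruple $(Q, \Gamma, \delta, q_0)$ in which
$(Q, \Gamma, \delta)$ is a pushdown system and $q_0 \in Q$ is an initial
(root) state. $\mathbb{RPDS}$ is the class of all rooted pushdown systems.

For a rooted pushdown system $M = (Q, \Gamma, \delta, q_0)$, we define
the root-reachable transition relation:

\[ c \underset{M}{\overset{g}{\longmapsto\!\!\!\!\to}} c' \text{ iff }
   (q_0, \langle\rangle) \underset{M}{\longmapsto^{*}} c \text{ and }
   c \underset{M}{\overset{g}{\longmapsto}} c'. \]

In other words, the root-reachable transition relation also makes
sure that the root control state can actually reach the transition.

We overload the root-reachable transition relation to operate on
control states as well:

\[ q \underset{M}{\overset{g}{\longmapsto\!\!\!\!\to}} q' \text{ iff }
   (q, \vec{\gamma}) \underset{M}{\overset{g}{\longmapsto\!\!\!\!\to}} (q', \vec{\gamma}')
   \text{ for some stacks } \vec{\gamma}, \vec{\gamma}'. \]

For both root-reachable relations, if we elide the stack-action label,
then, as in the un-rooted case, the transition holds if there exists
some stack action that enables the transition:

\[ q \underset{M}{\longmapsto\!\!\!\!\to} q' \text{ iff }
   q \underset{M}{\overset{g}{\longmapsto\!\!\!\!\to}} q' \text{ for some action } g. \]

2.5 Pushdown automata

A pushdown automaton is a generalization of a rooted pushdown
system, a 7-tuple $(Q, \Sigma, \Gamma, \delta, q_0, F, \vec{\gamma})$ in which:

1. $\Sigma$ is an input alphabet;

2. $\delta \subseteq Q \times \Gamma_{\pm} \times (\Sigma \cup \{\epsilon\}) \times Q$ is a transition relation;

3. $F \subseteq Q$ is a set of accepting states; and

4. $\vec{\gamma} \in \Gamma^{*}$ is the initial stack.

We use $\mathbb{PDA}$ to denote the class of all pushdown automata.

Pushdown automata recognize languages over their input alphabet.
To do so, their transition relation may optionally consume an input
character upon transition. Formally, a PDA
$M = (Q, \Sigma, \Gamma, \delta, q_0, F, \vec{\gamma})$ recognizes the
language $\mathcal{L}(M) \subseteq \Sigma^{*}$:

\[
\begin{aligned}
\epsilon \in \mathcal{L}(M) &\text{ if } q_0 \in F \\
aw \in \mathcal{L}(M) &\text{ if } \delta(q_0, \gamma_{+}, a, q') \text{ and } w \in \mathcal{L}(Q, \Sigma, \Gamma, \delta, q', F, \gamma : \vec{\gamma}) \\
aw \in \mathcal{L}(M) &\text{ if } \delta(q_0, \epsilon, a, q') \text{ and } w \in \mathcal{L}(Q, \Sigma, \Gamma, \delta, q', F, \vec{\gamma}) \\
aw \in \mathcal{L}(M) &\text{ if } \delta(q_0, \gamma_{-}, a, q') \text{ and } w \in \mathcal{L}(Q, \Sigma, \Gamma, \delta, q', F, \vec{\gamma}') \\
&\quad \text{where } \vec{\gamma} = \langle \gamma, \gamma_2, \ldots, \gamma_n \rangle \text{ and } \vec{\gamma}' = \langle \gamma_2, \ldots, \gamma_n \rangle,
\end{aligned}
\]

where $a$ is either the empty string $\epsilon$ or a single character.

3. Setting: A-Normal Form λ-calculus

Since our goal is to create pushdown control-flow analyses of
higher-order languages, we choose to operate on the λ-calculus.
For simplicity of the concrete and abstract semantics, we choose
to analyze programs in A-Normal Form; however, this is strictly a
cosmetic choice: all of our results can be replayed mutatis mutandis
in a direct-style setting. ANF enforces an order of evaluation and it
requires that all arguments to a function be atomic:

\[
\begin{aligned}
e \in \mathit{Exp} ::={} & \texttt{(let ((}v\ \mathit{call}\texttt{)) } e\texttt{)} && \text{[non-tail call]} \\
\mid{} & \mathit{call} && \text{[tail call]} \\
\mid{} & æ && \text{[return]} \\
f, æ \in \mathit{Atom} ::={} & v \mid \mathit{lam} && \text{[atomic expressions]} \\
\mathit{lam} \in \mathit{Lam} ::={} & \texttt{(}\lambda\ \texttt{(}v\texttt{) } e\texttt{)} && \text{[lambda terms]} \\
\mathit{call} \in \mathit{Call} ::={} & \texttt{(}f\ æ\texttt{)} && \text{[applications]} \\
v \in \mathit{Var} \ & \text{is a set of identifiers} && \text{[variables].}
\end{aligned}
\]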



We use the CESK machine of Felleisen and Friedman [1987] to 
specify the semantics of ANF. We have chosen the CESK machine 
because it has an explicit stack, and under abstraction, the stack 
component of our CESK machine will become the stack component 
of a pushdown system. 

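First, we define a set of configurations ($\mathit{Conf}$) for this machine:

\[
\begin{aligned}
c \in \mathit{Conf} &= \mathit{Exp} \times \mathit{Env} \times \mathit{Store} \times \mathit{Kont} && \text{[configurations]} \\
\rho \in \mathit{Env} &= \mathit{Var} \rightharpoonup \mathit{Addr} && \text{[environments]} \\
\sigma \in \mathit{Store} &= \mathit{Addr} \to \mathit{Clo} && \text{[stores]} \\
\mathit{clo} \in \mathit{Clo} &= \mathit{Lam} \times \mathit{Env} && \text{[closures]} \\
\kappa \in \mathit{Kont} &= \mathit{Frame}^{*} && \text{[continuations]} \\
\phi \in \mathit{Frame} &= \mathit{Var} \times \mathit{Exp} \times \mathit{Env} && \text{[stack frames]} \\
a \in \mathit{Addr} \ & \text{is an infinite set of addresses} && \text{[addresses].}
\end{aligned}
\]

To define the semantics, we need five items:

1. $\mathcal{I} : \mathit{Exp} \to \mathit{Conf}$ injects an expression into a configuration.

2. $\mathcal{A} : \mathit{Atom} \times \mathit{Env} \times \mathit{Store} \rightharpoonup \mathit{Clo}$ evaluates atomic expressions.

3. $\mathcal{E} : \mathit{Exp} \to \mathcal{P}(\mathit{Conf})$ computes the set of reachable machine
configurations for a given program.

4. $(\Rightarrow) \subseteq \mathit{Conf} \times \mathit{Conf}$ transitions between configurations.

5. $\mathit{alloc} : \mathit{Var} \times \mathit{Conf} \to \mathit{Addr}$ chooses fresh store addresses for
newly bound variables.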

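Program injection  The program injection function pairs an
expression with an empty environment, an empty store and an empty
stack to create the initial configuration:

\[ c_0 = \mathcal{I}(e) = (e, [], [], \langle\rangle). \]

Atomic expression evaluation  The atomic expression evaluator,
$\mathcal{A} : \mathit{Atom} \times \mathit{Env} \times \mathit{Store} \rightharpoonup \mathit{Clo}$,
returns the value of an atomic expression in the context of an
environment and a store:

\[
\begin{aligned}
\mathcal{A}(\mathit{lam}, \rho, \sigma) &= (\mathit{lam}, \rho) && \text{[closure creation]} \\
\mathcal{A}(v, \rho, \sigma) &= \sigma(\rho(v)) && \text{[variable look-up].}
\end{aligned}
\]

Reachable configurations  The evaluator $\mathcal{E} : \mathit{Exp} \to \mathcal{P}(\mathit{Conf})$
returns all configurations reachable from the initial configuration:

\[ \mathcal{E}(e) = \{ c : \mathcal{I}(e) \Rightarrow^{*} c \}. \]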

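Transition relation  To define the transition $c \Rightarrow c'$, we need three
rules. The first rule handles tail calls by evaluating the function into
a closure, evaluating the argument into a value and then moving to
the body of the λ-term within the closure:

\[
\begin{aligned}
\overbrace{([\![\texttt{(}f\ æ\texttt{)}]\!], \rho, \sigma, \kappa)}^{c} &\Rightarrow (e, \rho'', \sigma', \kappa), \text{ where} \\
([\![\texttt{(}\lambda\ \texttt{(}v\texttt{) } e\texttt{)}]\!], \rho') &= \mathcal{A}(f, \rho, \sigma) \\
a &= \mathit{alloc}(v, c) \\
\rho'' &= \rho'[v \mapsto a] \\
\sigma' &= \sigma[a \mapsto \mathcal{A}(æ, \rho, \sigma)].
\end{aligned}
\]

A non-tail call pushes a frame onto the stack and evaluates the call:

\[ ([\![\texttt{(let ((}v\ \mathit{call}\texttt{)) } e\texttt{)}]\!], \rho, \sigma, \kappa) \Rightarrow (\mathit{call}, \rho, \sigma, (v, e, \rho) : \kappa). \]

A function return pops a stack frame:

\[
\begin{aligned}
\overbrace{(æ, \rho, \sigma, (v, e, \rho') : \kappa)}^{c} &\Rightarrow (e, \rho'', \sigma', \kappa), \text{ where} \\
a &= \mathit{alloc}(v, c) \\
\rho'' &= \rho'[v \mapsto a] \\
\sigma' &= \sigma[a \mapsto \mathcal{A}(æ, \rho, \sigma)].
\end{aligned}
\]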

Allocation The address-allocation function is an opaque param- 
eter in this semantics. We have done this so that the forthcoming 
abstract semantics may also parameterize allocation, and in so do- 
ing provide a knob to tune the polyvariance and context-sensitivity 
of the resulting analysis. For the sake of defining the concrete se- 
mantics, letting addresses be natural numbers suffices, and then the 
allocator can choose the lowest unused address: 

\[
\begin{aligned}
\mathit{Addr} &= \mathbb{N} \\
\mathit{alloc}(v, (e, \rho, \sigma, \kappa)) &= 1 + \max(\mathit{dom}(\sigma)).
\end{aligned}
\]
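
The three transition rules and the allocator can be exercised with a small Python interpreter. This is a sketch under our own assumptions, not the authors' implementation: the class names `Var`, `Lam`, `Call` and `Let`, the functions `atom`, `alloc`, `step` and `run`, and the choice to start addresses at 1 when the store is empty are all illustrative. Environments and stores are plain dictionaries, and the continuation is a tuple of frames with the top at index 0.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class Call:
    f: object
    arg: object

@dataclass(frozen=True)
class Let:
    var: str
    call: Call
    body: object

def atom(ae, env, store):
    """Atomic evaluation: a lambda closes over env, a variable is looked up."""
    if isinstance(ae, Lam):
        return (ae, env)
    return store[env[ae.name]]

def alloc(store):
    """Lowest unused address (assumption: 1 when the store is empty)."""
    return 1 + max(store, default=0)

def step(e, env, store, kont):
    if isinstance(e, Let):    # non-tail call: push a frame
        return e.call, env, store, ((e.var, e.body, env),) + kont
    if isinstance(e, Call):   # tail call: enter the closure body
        lam, env2 = atom(e.f, env, store)
        a = alloc(store)
        return (lam.body, {**env2, lam.param: a},
                {**store, a: atom(e.arg, env, store)}, kont)
    if kont:                  # return: pop a frame and bind the result
        (v, body, env2), rest = kont[0], kont[1:]
        a = alloc(store)
        return body, {**env2, v: a}, {**store, a: atom(e, env, store)}, rest
    return None               # atomic expression with an empty stack: halt

def run(program):
    config = (program, {}, {}, ())
    while True:
        nxt = step(*config)
        if nxt is None:
            e, env, store, _ = config
            return atom(e, env, store)
        config = nxt

# The id example, with lambdas standing in for the constants 3 and 4.
ident = Lam("x", Var("x"))
three = Lam("y", Var("y"))
four = Lam("z", Var("z"))
body = Let("a", Call(Var("id"), three),
           Let("b", Call(Var("id"), four), Var("a")))
result = run(Call(Lam("id", body), ident))
```

Here `result` is the closure standing in for 3: in the concrete machine, each return pops exactly the frame pushed by its own call, so a never sees the value bound at the second call site.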

4. An infinite-state abstract interpretation 

Our goal is to statically bound the higher-order control-flow of the 
CESK machine of the previous section. So, we are going to conduct 
an abstract interpretation. 

Since we are concerned with return-flow precision, we are go- 
ing to abstract away less information than we normally would. 
Specifically, we are going to construct an infinite-state abstract in- 
terpretation of the CESK machine by leaving its stack unabstracted. 
(With an infinite-state abstraction, the usual approach for comput- 
ing the static analysis — exploring the abstract configurations reach- 
able from some initial configuration — simply will not work. Sub- 
sequent sections focus on finding an algorithm that can compute a 
finite representation of the reachable abstract configurations of the 
abstracted CESK machine.) 

For the abstract interpretation of the CESK machine, we need 
an abstract configuration-space (Figure 1). To construct one, we 
force addresses to be a finite set, but crucially, we leave the stack 
untouched. When we compact the set of addresses into a finite 
set, the machine may run out of addresses to allocate, and when 
it does, the pigeon-hole principle will force multiple closures to 
reside at the same address. As a result, we have no choice but to 
force the range of the store to become a power set in the abstract 
configuration-space. To construct the abstract transition relation, 
we need five components analogous to those from the concrete 
semantics. 

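Program injection  The abstract injection function
$\hat{\mathcal{I}} : \mathit{Exp} \to \widehat{\mathit{Conf}}$ pairs an expression with
an empty environment, an empty store and an empty stack to create the
initial abstract configuration:

\[ \hat{c}_0 = \hat{\mathcal{I}}(e) = (e, [], [], \langle\rangle). \]

Atomic expression evaluation  The abstract atomic expression
evaluator, $\hat{\mathcal{A}} : \mathit{Atom} \times \widehat{\mathit{Env}} \times \widehat{\mathit{Store}} \to \mathcal{P}(\widehat{\mathit{Clo}})$,
returns the value of an atomic expression in the context of an
environment and a store; note how it returns a set:

\[
\begin{aligned}
\hat{\mathcal{A}}(\mathit{lam}, \hat{\rho}, \hat{\sigma}) &= \{ (\mathit{lam}, \hat{\rho}) \} && \text{[closure creation]} \\
\hat{\mathcal{A}}(v, \hat{\rho}, \hat{\sigma}) &= \hat{\sigma}(\hat{\rho}(v)) && \text{[variable look-up].}
\end{aligned}
\]

\[
\begin{aligned}
\hat{c} \in \widehat{\mathit{Conf}} &= \mathit{Exp} \times \widehat{\mathit{Env}} \times \widehat{\mathit{Store}} \times \widehat{\mathit{Kont}} && \text{[configurations]} \\
\hat{\rho} \in \widehat{\mathit{Env}} &= \mathit{Var} \rightharpoonup \widehat{\mathit{Addr}} && \text{[environments]} \\
\hat{\sigma} \in \widehat{\mathit{Store}} &= \widehat{\mathit{Addr}} \to \mathcal{P}(\widehat{\mathit{Clo}}) && \text{[stores]} \\
\widehat{\mathit{clo}} \in \widehat{\mathit{Clo}} &= \mathit{Lam} \times \widehat{\mathit{Env}} && \text{[closures]} \\
\hat{\kappa} \in \widehat{\mathit{Kont}} &= \widehat{\mathit{Frame}}^{*} && \text{[continuations]} \\
\hat{\phi} \in \widehat{\mathit{Frame}} &= \mathit{Var} \times \mathit{Exp} \times \widehat{\mathit{Env}} && \text{[stack frames]} \\
\hat{a} \in \widehat{\mathit{Addr}} \ & \text{is a finite set of addresses} && \text{[addresses].}
\end{aligned}
\]

Figure 1. The abstract configuration-space.

Reachable configurations  The abstract program evaluator
$\hat{\mathcal{E}} : \mathit{Exp} \to \mathcal{P}(\widehat{\mathit{Conf}})$ returns all of the
configurations reachable from the initial configuration:

\[ \hat{\mathcal{E}}(e) = \{ \hat{c} : \hat{\mathcal{I}}(e) \leadsto^{*} \hat{c} \}. \]

Because there are an infinite number of abstract configurations,
a naive implementation of this function may not terminate. In
Sections 5 through 8, we show that there is a way to compute a
finite representation of this set.

Transition relation  The abstract transition relation
$(\leadsto) \subseteq \widehat{\mathit{Conf}} \times \widehat{\mathit{Conf}}$ has three rules,
one of which has become non-deterministic. A tail call may fork because
there could be multiple abstract closures that it is invoking:

\[
\begin{aligned}
\overbrace{([\![\texttt{(}f\ æ\texttt{)}]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa})}^{\hat{c}} &\leadsto (e, \hat{\rho}'', \hat{\sigma}', \hat{\kappa}), \text{ where} \\
([\![\texttt{(}\lambda\ \texttt{(}v\texttt{) } e\texttt{)}]\!], \hat{\rho}') &\in \hat{\mathcal{A}}(f, \hat{\rho}, \hat{\sigma}) \\
\hat{a} &= \widehat{\mathit{alloc}}(v, \hat{c}) \\
\hat{\rho}'' &= \hat{\rho}'[v \mapsto \hat{a}] \\
\hat{\sigma}' &= \hat{\sigma} \sqcup [\hat{a} \mapsto \hat{\mathcal{A}}(æ, \hat{\rho}, \hat{\sigma})].
\end{aligned}
\]

We define all of the partial orders shortly, but for stores:

\[ (\hat{\sigma} \sqcup \hat{\sigma}')(\hat{a}) = \hat{\sigma}(\hat{a}) \cup \hat{\sigma}'(\hat{a}). \]

A non-tail call pushes a frame onto the stack and evaluates the call:

\[ ([\![\texttt{(let ((}v\ \mathit{call}\texttt{)) } e\texttt{)}]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \leadsto (\mathit{call}, \hat{\rho}, \hat{\sigma}, (v, e, \hat{\rho}) : \hat{\kappa}). \]

A function return pops a stack frame:

\[
\begin{aligned}
\overbrace{(æ, \hat{\rho}, \hat{\sigma}, (v, e, \hat{\rho}') : \hat{\kappa})}^{\hat{c}} &\leadsto (e, \hat{\rho}'', \hat{\sigma}', \hat{\kappa}), \text{ where} \\
\hat{a} &= \widehat{\mathit{alloc}}(v, \hat{c}) \\
\hat{\rho}'' &= \hat{\rho}'[v \mapsto \hat{a}] \\
\hat{\sigma}' &= \hat{\sigma} \sqcup [\hat{a} \mapsto \hat{\mathcal{A}}(æ, \hat{\rho}, \hat{\sigma})].
\end{aligned}
\]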
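
The store join used by the tail-call and return rules can be sketched as follows. The encoding is an assumption made for illustration: an abstract store is a Python dict from addresses to frozensets of closures, and `join` and `weak_update` are hypothetical helper names. Closures are represented by opaque strings here.

```python
def join(store1, store2):
    """Point-wise join: each address maps to the union of its closure sets."""
    return {a: store1.get(a, frozenset()) | store2.get(a, frozenset())
            for a in store1.keys() | store2.keys()}

def weak_update(store, addr, closures):
    """Add `closures` at `addr` without discarding what is already there."""
    return join(store, {addr: frozenset(closures)})

# Two bindings to the same abstract address accumulate instead of overwriting.
s = weak_update({}, "x", {"clo3"})
s = weak_update(s, "x", {"clo4"})
```

After both updates, `s["x"]` holds both closures. This is where the abstract semantics differs from the concrete store update, which overwrites the address.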

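Allocation, polyvariance and context-sensitivity  In the abstract
semantics, the abstract allocation function
$\widehat{\mathit{alloc}} : \mathit{Var} \times \widehat{\mathit{Conf}} \to \widehat{\mathit{Addr}}$
determines the polyvariance of the analysis (and, by extension, its
context-sensitivity). In a control-flow analysis, polyvariance
literally refers to the number of abstract addresses (variants)
there are for each variable. By selecting the right abstract allocation
function, we can instantiate pushdown versions of classical flow
analyses.

Monovariance: Pushdown 0CFA  Pushdown 0CFA uses variables
themselves for abstract addresses:

\[
\begin{aligned}
\widehat{\mathit{Addr}} &= \mathit{Var} \\
\widehat{\mathit{alloc}}(v, \hat{c}) &= v.
\end{aligned}
\]

Context-sensitive: Pushdown 1CFA  Pushdown 1CFA pairs the
variable with the current expression to get an abstract address:

\[
\begin{aligned}
\widehat{\mathit{Addr}} &= \mathit{Var} \times \mathit{Exp} \\
\widehat{\mathit{alloc}}(v, (e, \hat{\rho}, \hat{\sigma}, \hat{\kappa})) &= (v, e).
\end{aligned}
\]

Polymorphic splitting: Pushdown poly/CFA  Assuming we compiled
the program from a programming language with let-bound
polymorphism and marked which functions were let-bound, we can
enable polymorphic splitting:

\[
\begin{aligned}
\widehat{\mathit{Addr}} &= \mathit{Var} + \mathit{Var} \times \mathit{Exp} \\
\widehat{\mathit{alloc}}(v, ([\![\texttt{(}f\ æ\texttt{)}]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa})) &=
  \begin{cases}
    (v, [\![\texttt{(}f\ æ\texttt{)}]\!]) & f \text{ is let-bound} \\
    v & \text{otherwise.}
  \end{cases}
\end{aligned}
\]

Pushdown k-CFA  For pushdown k-CFA, we need to look beyond
the current state and at the last k states. By concatenating the
expressions in the last k states together, and pairing this sequence
with a variable, we get pushdown k-CFA:

\[
\begin{aligned}
\widehat{\mathit{Addr}} &= \mathit{Var} \times \mathit{Exp}^{k} \\
\widehat{\mathit{alloc}}(v, \langle (e_1, \hat{\rho}_1, \hat{\sigma}_1, \hat{\kappa}_1), \ldots \rangle) &= (v, \langle e_1, \ldots, e_k \rangle).
\end{aligned}
\]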
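
As a rough illustration of how the allocator acts as a tuning knob, the strategies above can be written as interchangeable Python functions. The function names and the tuple encoding of configurations are our own assumptions; expressions are represented by strings purely for the example, and the k-CFA variant assumes its second argument is a most-recent-first list of configurations.

```python
def alloc_0cfa(v, config):
    """Monovariant: the variable itself is the abstract address."""
    return v

def alloc_1cfa(v, config):
    """Pair the variable with the current expression."""
    e, _env, _store, _kont = config
    return (v, e)

def alloc_kcfa(k):
    """Build an allocator that pairs v with the last k expressions.

    Assumption: `history` lists configurations, most recent first.
    """
    def alloc(v, history):
        return (v, tuple(e for (e, _env, _store, _kont) in history[:k]))
    return alloc

# The two calls to id from the introductory example, as configurations.
c1 = ("(id 3)", {}, {}, ())
c2 = ("(id 4)", {}, {}, ())

same = alloc_0cfa("x", c1) == alloc_0cfa("x", c2)
split = alloc_1cfa("x", c1) != alloc_1cfa("x", c2)
two_back = alloc_kcfa(2)("x", [c2, c1])
```

Under the monovariant allocator both bindings of x share one address, while the 1CFA-style allocator gives each call site its own address.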

4.1 Partial orders 

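For each set $\hat{X}$ inside the abstract configuration-space, we use the
natural partial order, $(\sqsubseteq_{\hat{X}}) \subseteq \hat{X} \times \hat{X}$.
Abstract addresses and syntactic sets have flat partial orders. For the
other sets:

• The partial order lifts point-wise over environments:

\[ \hat{\rho} \sqsubseteq \hat{\rho}' \text{ iff } \hat{\rho}(v) = \hat{\rho}'(v) \text{ for all } v \in \mathit{dom}(\hat{\rho}). \]

• The partial order lifts component-wise over closures:

\[ (\mathit{lam}, \hat{\rho}) \sqsubseteq (\mathit{lam}, \hat{\rho}') \text{ iff } \hat{\rho} \sqsubseteq \hat{\rho}'. \]

• The partial order lifts point-wise over stores:

\[ \hat{\sigma} \sqsubseteq \hat{\sigma}' \text{ iff } \hat{\sigma}(\hat{a}) \sqsubseteq \hat{\sigma}'(\hat{a}) \text{ for all } \hat{a} \in \mathit{dom}(\hat{\sigma}). \]

• The partial order lifts component-wise over frames:

\[ (v, e, \hat{\rho}) \sqsubseteq (v, e, \hat{\rho}') \text{ iff } \hat{\rho} \sqsubseteq \hat{\rho}'. \]

• The partial order lifts element-wise over continuations:

\[ \langle \hat{\phi}_1, \ldots, \hat{\phi}_n \rangle \sqsubseteq \langle \hat{\phi}'_1, \ldots, \hat{\phi}'_n \rangle \text{ iff } \hat{\phi}_i \sqsubseteq \hat{\phi}'_i. \]

• The partial order lifts component-wise across configurations:

\[ (e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \sqsubseteq (e, \hat{\rho}', \hat{\sigma}', \hat{\kappa}') \text{ iff } \hat{\rho} \sqsubseteq \hat{\rho}' \text{ and } \hat{\sigma} \sqsubseteq \hat{\sigma}' \text{ and } \hat{\kappa} \sqsubseteq \hat{\kappa}'. \]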
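
Two of these orders are easy to check mechanically. The sketch below assumes environments are dicts from variables to addresses and stores are dicts from addresses to frozensets of closures; it also simplifies the order on closure sets to plain set inclusion. The names `env_leq` and `store_leq` are hypothetical.

```python
def env_leq(env1, env2):
    """env1 is below env2 if they agree on every variable env1 defines."""
    return all(v in env2 and env1[v] == env2[v] for v in env1)

def store_leq(store1, store2):
    """store1 is below store2 if each closure set is contained point-wise.

    Assumption: closure sets are compared by plain set inclusion.
    """
    return all(store1[a] <= store2.get(a, frozenset()) for a in store1)

small = {"x": frozenset({"clo3"})}
large = {"x": frozenset({"clo3", "clo4"}), "y": frozenset({"clo5"})}
```

For example, `store_leq(small, large)` holds but `store_leq(large, small)` does not, matching the intuition that a larger store approximates more concrete behaviors.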

4.2 Soundness 

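To prove soundness, we need an abstraction map $\alpha$ that connects
the concrete and abstract configuration-spaces:

\[
\begin{aligned}
\alpha(e, \rho, \sigma, \kappa) &= (e, \alpha(\rho), \alpha(\sigma), \alpha(\kappa)) \\
\alpha(\rho) &= \lambda v.\, \alpha(\rho(v)) \\
\alpha(\sigma) &= \lambda \hat{a}. \bigsqcup_{\alpha(a) = \hat{a}} \{ \alpha(\sigma(a)) \} \\
\alpha\langle \phi_1, \ldots, \phi_n \rangle &= \langle \alpha(\phi_1), \ldots, \alpha(\phi_n) \rangle \\
\alpha(v, e, \rho) &= (v, e, \alpha(\rho)) \\
\alpha(a) &\text{ is determined by the allocation functions.}
\end{aligned}
\]

\[
\begin{aligned}
\mathcal{PDA}(e) &= (Q, \Sigma, \Gamma, \delta, q_0, F, \langle\rangle), \text{ where} \\
Q &= \mathit{Exp} \times \widehat{\mathit{Env}} \times \widehat{\mathit{Store}} \\
\Sigma &= Q \\
\Gamma &= \widehat{\mathit{Frame}} \\
(q, \epsilon, q', q') \in \delta &\text{ iff } (q, \hat{\kappa}) \leadsto (q', \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q, \hat{\phi}_{-}, q', q') \in \delta &\text{ iff } (q, \hat{\phi} : \hat{\kappa}) \leadsto (q', \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q, \hat{\phi}'_{+}, q', q') \in \delta &\text{ iff } (q, \hat{\kappa}) \leadsto (q', \hat{\phi}' : \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q_0, \langle\rangle) &= \hat{\mathcal{I}}(e).
\end{aligned}
\]

Figure 2. $\mathcal{PDA} : \mathit{Exp} \to \mathbb{PDA}$.

It is easy to prove that the abstract transition relation simulates
the concrete transition relation:

Theorem 4.1. If

\[ \alpha(c) \sqsubseteq \hat{c} \text{ and } c \Rightarrow c', \]

then there must exist $\hat{c}' \in \widehat{\mathit{Conf}}$ such that:

\[ \alpha(c') \sqsubseteq \hat{c}' \text{ and } \hat{c} \leadsto \hat{c}'. \]

Proof. The proof follows by case-wise analysis on the type of the
expression in the configuration. It is a straightforward adaptation of
similar proofs, such as that of Might [2007] for k-CFA. □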

5. From the abstracted CESK machine to a PDA 

In the previous section, we constructed an infinite-state abstract 
interpretation of the CESK machine. The infinite-state nature of the 
abstraction makes it difficult to see how to answer static analysis 
questions. Consider, for instance, a control-flow question:

At the call site (f æ), may a closure over lam be called?

If the abstracted CESK machine were a finite-state machine, an
algorithm could answer this question by enumerating all reachable
configurations and looking for an abstract configuration
$([\![\texttt{(}f\ æ\texttt{)}]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa})$ in which
$(\mathit{lam}, \_) \in \hat{\mathcal{A}}(f, \hat{\rho}, \hat{\sigma})$. However,
because the abstracted CESK machine may contain an infinite number
of reachable configurations, an algorithm cannot enumerate them.

Fortunately, we can recast the abstracted CESK as a special
kind of infinite-state system: a pushdown automaton (PDA).
Pushdown automata occupy a sweet spot in the theory of computation:
they have an infinite configuration-space, yet many useful
properties (e.g., word membership, non-emptiness, control-state
reachability) remain decidable. Once the abstracted CESK machine
becomes a PDA, we can answer the control-flow question by checking
whether a specific regular language, when intersected with the
language of the PDA, turns into the empty language.

The recasting as a PDA is a shift in perspective. A configuration
has an expression, an environment and a store. A stack character
is a frame. We choose to make the alphabet the set of control
states, so that the language accepted by the PDA will be sequences
of control-states visited by the abstracted CESK machine. Thus,
every transition will consume the control-state to which it
transitioned as an input character. Figure 2 defines the program-to-PDA
conversion function $\mathcal{PDA} : \mathit{Exp} \to \mathbb{PDA}$. (Note the implicit use
of the isomorphism $Q \times \widehat{\mathit{Kont}} \cong \widehat{\mathit{Conf}}$.)

At this point, we can answer questions about whether a specified
control state is reachable by formulating a question about the
intersection of a regular language with a context-free language
described by the PDA. That is, if we want to know whether the control
state $(e', \hat{\rho}, \hat{\sigma})$ is reachable in a program $e$, we can
reduce the problem to determining:

\[ \Sigma^{*} \cdot \{ (e', \hat{\rho}, \hat{\sigma}) \} \cdot \Sigma^{*} \cap \mathcal{L}(\mathcal{PDA}(e)) \neq \emptyset, \]

where $L_1 \cdot L_2$ is the concatenation of formal languages $L_1$ and $L_2$.
Theorem 5.1. Control-state reachability is decidable. 

Proof. The intersection of a regular language and a context-free 
language is context-free. The emptiness of a context-free language 
is decidable. □ 

Now, consider how to use control-state reachability to answer
the control-flow question from earlier. There are a finite number of
possible control states in which the λ-term lam may flow to the
function f in call site (f æ); let's call this set of states $\hat{S}$:

\[ \hat{S} = \{ ([\![\texttt{(}f\ æ\texttt{)}]\!], \hat{\rho}, \hat{\sigma}) : (\mathit{lam}, \hat{\rho}') \in \hat{\mathcal{A}}(f, \hat{\rho}, \hat{\sigma}) \text{ for some } \hat{\rho}' \}. \]

What we want to know is whether any state in the set $\hat{S}$ is reachable
in the PDA. In effect, what we are asking is whether there exists a
control state $q \in \hat{S}$ such that:

\[ \Sigma^{*} \cdot \{ q \} \cdot \Sigma^{*} \cap \mathcal{L}(\mathcal{PDA}(e)) \neq \emptyset. \]

If this is true, then lam may flow to f; if false, then it does not.

Problem: Doubly exponential complexity  The
non-emptiness-of-intersection approach establishes decidability of pushdown
control-flow analysis. But two exponential complexity barriers
make this technique impractical.

First, there are an exponential number of both environments
($|\widehat{\mathit{Addr}}|^{|\mathit{Var}|}$) and stores
($2^{|\widehat{\mathit{Clo}}| \times |\widehat{\mathit{Addr}}|}$) to consider for the set $\hat{S}$. On
top of that, computing the intersection of a regular language with a
context-free language will require enumeration of the (exponential)
control-state-space of the PDA. As a result, this approach is doubly
exponential. For the next few sections, our goal will be to lower the
complexity of pushdown control-flow analysis.

6. Focusing on reachability 

In the previous section, we saw that control-flow analysis reduces 
to the reachability of certain control states within a pushdown sys- 
tem. We also determined reachability by converting the abstracted 
CESK machine into a PDA, and using emptiness-testing on a lan- 
guage derived from that PDA. Unfortunately, we also found that 
this approach is deeply exponential. 

Since control-flow analysis reduced to the reachability of 
control-states in the PDA, we skip the language problems and go 
directly to reachability algorithms of Bouajjani et al. [1997], Ko- 
dumal and Aiken [2004], Reps [1998] and Reps et al. [2005] that 
determine the reachable configurations within a pushdown system. 
These algorithms are even polynomial-time. Unfortunately, some 
of them are polynomial-time in the number of control states, and 
in the abstracted CESK machine, there are an exponential number 
of control states. We don't want to enumerate the entire control 
state-space, or else the search becomes exponential in even the best 
case. 

To avoid this worst-case behavior, we present a straightforward 
pushdown-reachability algorithm that considers only the reachable 
control states. We cast our reachability algorithm as a fixed-point 
iteration, in which we incrementally construct the reachable subset 
of a pushdown system. We term these algorithms "iterative Dyck 
state graph construction." 

A Dyck state graph is a compacted, rooted pushdown system
$G = (S, \Gamma, E, s_0)$, in which:

1. $S$ is a finite set of nodes;

2. $\Gamma$ is a set of frames;

3. $E \subseteq S \times \Gamma_{\pm} \times S$ is a set of stack-action edges; and

4. $s_0$ is an initial state;

such that for any node $s \in S$, it must be the case that:

\[ (s_0, \langle\rangle) \underset{G}{\longmapsto^{*}} (s, \vec{\gamma}) \text{ for some stack } \vec{\gamma}. \]

In other words, a Dyck state graph is equivalent to a rooted pushdown
system in which there is a legal path to every control state
from the initial control state.²

We use $\mathbb{DSG}$ to denote the class of Dyck state graphs. Clearly:

\[ \mathbb{DSG} \subset \mathbb{RPDS}. \]

A Dyck state graph is a rooted pushdown system with the "fat" 
trimmed off; in this case, unreachable control states and unreach- 
able transitions are the "fat." 

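We can formalize the connection between rooted pushdown
systems and Dyck state graphs with a map:

\[ \mathcal{DSG} : \mathbb{RPDS} \to \mathbb{DSG}. \]

Given a rooted pushdown system $M = (Q, \Gamma, \delta, q_0)$, its equivalent
Dyck state graph is $\mathcal{DSG}(M) = (S, \Gamma, E, s_0)$, where the set $S$
contains reachable nodes:

\[ S = \{ q : (q_0, \langle\rangle) \underset{M}{\longmapsto^{*}} (q, \vec{\gamma}) \text{ for some stack } \vec{\gamma} \}, \]

and the set $E$ contains reachable edges:

\[ E = \{ q \overset{g}{\rightarrowtail} q' : q \underset{M}{\overset{g}{\longmapsto\!\!\!\!\to}} q' \}, \]

and $s_0 = q_0$.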

In practice, the real difference between a rooted pushdown sys- 
tem and a Dyck state graph is that our rooted pushdown system will 
be defined intensionally (having come from the components of an 
abstracted CESK machine), whereas the Dyck state graph will be 
defined extensionally, with the contents of each component explic- 
itly enumerated during its construction. 

Our near-term goals are (1) to convert our abstracted CESK 
machine into a rooted pushdown system and (2) to find an efficient 
method for computing an equivalent Dyck state graph from a rooted 
pushdown system. 

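To convert the abstracted CESK machine into a rooted pushdown
system, we use the function $\mathcal{RPDS} : \mathit{Exp} \to \mathbb{RPDS}$:

\[
\begin{aligned}
\mathcal{RPDS}(e) &= (Q, \Gamma, \delta, q_0) \\
Q &= \mathit{Exp} \times \widehat{\mathit{Env}} \times \widehat{\mathit{Store}} \\
\Gamma &= \widehat{\mathit{Frame}} \\
q \overset{\epsilon}{\rightarrowtail} q' \in \delta &\text{ iff } (q, \hat{\kappa}) \leadsto (q', \hat{\kappa}) \text{ for all } \hat{\kappa} \\
q \overset{\hat{\phi}_{-}}{\rightarrowtail} q' \in \delta &\text{ iff } (q, \hat{\phi} : \hat{\kappa}) \leadsto (q', \hat{\kappa}) \text{ for all } \hat{\kappa} \\
q \overset{\hat{\phi}_{+}}{\rightarrowtail} q' \in \delta &\text{ iff } (q, \hat{\kappa}) \leadsto (q', \hat{\phi} : \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q_0, \langle\rangle) &= \hat{\mathcal{I}}(e).
\end{aligned}
\]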


7. Compacting rooted pushdown systems 

We now turn our attention to compacting a rooted pushdown system
(defined intensionally) into a Dyck state graph (defined
extensionally). That is, we want to find an implementation of the function
$\mathcal{DSG}$. To do so, we first phrase the Dyck state graph construction
as the least fixed point of a monotonic function. This will provide a
method (albeit an inefficient one) for computing the function $\mathcal{DSG}$.
In the next section, we look at an optimized work-list driven
algorithm that avoids the inefficiencies of this version.

² We chose the term Dyck state graph because the sequences of stack
actions along valid paths through the graph correspond to substrings in
Dyck languages. A Dyck language is a language of balanced, "colored"
parentheses. In this case, each character in the stack alphabet is a color.

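The function $\mathcal{F} : \mathbb{RPDS} \to (\mathbb{DSG} \to \mathbb{DSG})$ generates the monotonic iteration function we need:

\begin{align*}
\mathcal{F}(M) &= f, \text{ where} \\
M &= (Q, \Gamma, \delta, q_0) \\
f(S, \Gamma, E, s_0) &= (S', \Gamma, E', s_0), \text{ where} \\
S' &= S \cup \left\{ s' : s \in S \text{ and } s \underset{M}{\longmapsto} s' \right\} \cup \{ s_0 \} \\
E' &= E \cup \left\{ s \overset{g}{\rightarrowtail} s' : s \in S \text{ and } s \overset{g}{\underset{M}{\longmapsto}} s' \right\}.
\end{align*}

Given a rooted pushdown system $M$, each application of the function $\mathcal{F}(M)$ accretes new edges at the frontier of the Dyck state graph. Once the algorithm reaches a fixed point, the Dyck state graph is complete:

Theorem 7.1. $\mathcal{DSG}(M) = \mathrm{lfp}(\mathcal{F}(M))$.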



Proof. Let $M = (Q, \Gamma, \delta, q_0)$. Let $f = \mathcal{F}(M)$. Observe that $\mathrm{lfp}(f) = f^n(\emptyset, \Gamma, \emptyset, q_0)$ for some $n$. When $N \subseteq M$, then it is easy to show that $f(N) \subseteq M$. Hence, $\mathcal{DSG}(M) \supseteq \mathrm{lfp}(\mathcal{F}(M))$.

To show $\mathcal{DSG}(M) \subseteq \mathrm{lfp}(\mathcal{F}(M))$, suppose this is not the case. Then, there must be at least one edge in $\mathcal{DSG}(M)$ that is not in $\mathrm{lfp}(\mathcal{F}(M))$. Let $(s, g, s')$ be one such edge, such that the state $s$ is in $\mathrm{lfp}(\mathcal{F}(M))$. Let $m$ be the lowest natural number such that $s$ appears in $f^m(M)$. By the definition of $f$, this edge must appear in $f^{m+1}(M)$, which means it must also appear in $\mathrm{lfp}(\mathcal{F}(M))$, which is a contradiction. Hence, $\mathcal{DSG}(M) \subseteq \mathrm{lfp}(\mathcal{F}(M))$. □
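The fixed-point construction of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of transitions as triples `(q, (kind, frame), q2)`, the helper names `eps_closure`, `step` and `dyck_state_graph`, and the toy system `delta` are all hypothetical and not part of the formal development. Pop legality is decided by a naive saturation that stands in for the CFL-reachability query, and it is recomputed from scratch in every iteration, which mirrors the inefficiency discussed below.

```python
from itertools import product

EPS, PUSH, POP = "eps", "push", "pop"

def eps_closure(states, edges):
    """All pairs (s, t) joined by a path with no net stack change."""
    H = {(s, s) for s in states}
    while True:
        new = {(s, t) for (s, (k, _), t) in edges if k == EPS}
        new |= {(a, d) for (a, b), (c, d) in product(H, H) if b == c}
        for (s, (k1, g1), s1) in edges:
            for (s2, (k2, g2), t) in edges:
                if k1 == PUSH and k2 == POP and g1 == g2 and (s1, s2) in H:
                    new.add((s, t))  # matched push ... pop
        if new <= H:
            return H
        H |= new

def step(delta, q0, S, E):
    """One application of f: grow (S, E) by the transitions legal from S."""
    H = eps_closure(S, E)
    S2, E2 = set(S) | {q0}, set(E)
    for (s, (k, g), t) in delta:
        if s not in S:
            continue
        if k == POP and not any(
                k1 == PUSH and g1 == g and (s1, s) in H
                for (_, (k1, g1), s1) in E):
            continue  # frame g cannot be on top of the stack at s
        S2.add(t)
        E2.add((s, (k, g), t))
    return S2, E2

def dyck_state_graph(delta, q0):
    """Iterate step from the empty graph until nothing changes."""
    S, E = set(), set()
    while True:
        S2, E2 = step(delta, q0, S, E)
        if (S2, E2) == (S, E):
            return S, E
        S, E = S2, E2

delta = {
    ("q0", (PUSH, "a"), "q1"),
    ("q1", (EPS, None), "q2"),
    ("q2", (POP, "a"), "q3"),
    ("q2", (POP, "b"), "q4"),   # never legal: "b" is never pushed
    ("q5", (EPS, None), "q0"),  # q5 is unreachable from q0
}
S, E = dyck_state_graph(delta, "q0")
```

In this toy system the unreachable state `q5` and the unmatched pop into `q4` are the "fat" that the construction leaves out.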



7.1 Complexity: Polynomial and exponential 

To determine the complexity of this algorithm, we ask two ques- 
tions: how many times would the algorithm invoke the iteration 
function in the worst case, and how much does each invocation cost 
in the worst case? The size of the final Dyck state graph bounds the 
run-time of the algorithm. Suppose the final Dyck state graph has 
$m$ states. In the worst case, the iteration function adds only a single edge each time. Since there are at most $2|\Gamma|m^2 + m^2$ edges in the final graph, the maximum number of iterations is $2|\Gamma|m^2 + m^2$.

The cost of computing each iteration is harder to bound. The cost of determining whether to add a push edge is proportional to the size of the stack alphabet, while the cost of determining whether to add an ε-edge is constant, so the cost of determining all new push and ε-edges to add is proportional to $|\Gamma|m + m$. Determining whether or not to add a pop edge is expensive. To add the pop edge $s \overset{\gamma_-}{\rightarrowtail} s'$, we must prove that there exists a configuration-path to the control state $s$, in which the character $\gamma$ is on the top of the stack. This reduces to a CFL-reachability query [Melski and Reps 2000] at each node, the cost of which is $O(|\Gamma_\pm|^3 m^3)$ [Kodumal and Aiken 2004].

To summarize, in terms of the number of reachable control states, the complexity of the most recent algorithm is:

\[ O\big((2|\Gamma|m^2 + m^2) \times (|\Gamma|m + m + |\Gamma_\pm|^3 m^3)\big) = O(|\Gamma|^4 m^5). \]
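The simplification in the last step can be spelled out term by term. The sketch below assumes that $\Gamma_\pm$ contains one push action and one pop action per frame plus ε, so that $|\Gamma_\pm| = 2|\Gamma| + 1$, which is what the edge bound above suggests, and that $|\Gamma| \geq 1$.

```latex
\begin{align*}
2|\Gamma|m^2 + m^2 &= O(|\Gamma|\,m^2) \\
|\Gamma|m + m + |\Gamma_\pm|^3 m^3 &= O(|\Gamma|^3 m^3)
  && \text{since } |\Gamma_\pm| = 2|\Gamma| + 1 \\
O(|\Gamma|\,m^2) \times O(|\Gamma|^3 m^3) &= O(|\Gamma|^4 m^5)
\end{align*}
```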



While this approach is polynomial in the number of reachable control states, it is far from efficient. In the next section, we provide an optimized version of this fixed-point algorithm that maintains a work-list and an ε-closure graph to avoid spurious recomputation.



8. Efficiency: Work-lists and ε-closure graphs

We have developed a fixed-point formulation of the Dyck state 
graph construction algorithm, but found that, in each iteration, it 
wasted effort by passing over all discovered states and edges, even 
though most will not contribute new states or edges. Taking a cue 
from graph search, we can adapt the fixed-point algorithm with a 
work-list. That is, our next algorithm will keep a work-list of new 
states and edges to consider, instead of reconsidering all of them. 
In each iteration, it will pull new states and edges from the work 
list, insert them into the Dyck state graph and then populate the 
work-list with new states and edges that have to be added as a 
consequence of the recent additions. 

8.1 ε-closure graphs

Figuring out what edges to add as a consequence of another edge requires care, for adding an edge can have ramifications on distant control states. Consider, for example, adding the ε-edge $q \overset{\epsilon}{\rightarrowtail} q'$ into the following graph:

\[ q_0 \overset{\gamma_+}{\rightarrowtail} q \qquad\qquad q' \overset{\gamma_-}{\rightarrowtail} q_1 \]

As soon as this edge drops in, an ε-edge "implicitly" appears between $q_0$ and $q_1$ because the net stack change between them is empty; the resulting graph looks like:

\[ q_0 \overset{\gamma_+}{\rightarrowtail} q \overset{\epsilon}{\rightarrowtail} q' \overset{\gamma_-}{\rightarrowtail} q_1 \qquad \text{with an implicit } \epsilon\text{-edge from } q_0 \text{ to } q_1, \]

where the implicit ε-edge is drawn as a dotted line in the original figure.

To keep track of these implicit edges, we will construct a second graph in conjunction with the Dyck state graph: an ε-closure graph. In the ε-closure graph, every edge indicates the existence of a no-net-stack-change path between control states. The ε-closure graph simplifies the task of figuring out which states and edges are impacted by the addition of a new edge.

Formally, an ε-closure graph is a pair $G_\epsilon = (N, H)$, where $N$ is a set of states, and $H \subseteq N \times N$ is a set of edges. Of course, all ε-closure graphs are reflexive: every node has a self loop. We use the symbol $\mathbb{ECG}$ to denote the class of all ε-closure graphs.

We have two notations for finding ancestors and descendants of a state in an ε-closure graph $G_\epsilon = (N, H)$:

\begin{align*}
\overleftarrow{G}_\epsilon[s] &= \{ s' : (s', s) \in H \} && \text{[ancestors]} \\
\overrightarrow{G}_\epsilon[s] &= \{ s' : (s, s') \in H \} && \text{[descendants]}.
\end{align*}
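One possible in-memory representation of an ε-closure graph is sketched below. The class name `EpsClosureGraph` and its methods are hypothetical and chosen only for illustration. The sketch stores both directions of every edge so that the ancestor and descendant lookups are direct, and it treats the self loops as implicit rather than storing them.

```python
from collections import defaultdict

class EpsClosureGraph:
    """Edges (s, t) record a no-net-stack-change path from s to t."""

    def __init__(self):
        self._succ = defaultdict(set)
        self._pred = defaultdict(set)

    def add_edge(self, s, t):
        """Record the edge; return True if it was not already present."""
        if s == t or t in self._succ[s]:
            return False  # self loops are implicit (reflexivity)
        self._succ[s].add(t)
        self._pred[t].add(s)
        return True

    def ancestors(self, s):
        """All s' with an edge (s', s), including s itself."""
        return self._pred[s] | {s}

    def descendants(self, s):
        """All s' with an edge (s, s'), including s itself."""
        return self._succ[s] | {s}

g = EpsClosureGraph()
first = g.add_edge("a", "b")
second = g.add_edge("a", "b")
```

Note that both lookups return only the direct neighbours in $H$, as in the definitions above; transitivity is something the construction algorithm has to maintain when it inserts edges.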



8.2 Integrating a work-list 

Since we only want to consider new states and edges in each iteration, we need a work-list, or in this case, two work-graphs. A Dyck state work-graph is a pair $(\Delta S, \Delta E)$ in which the set $\Delta S$ contains a set of states to add, and the set $\Delta E$ contains edges to be added to a Dyck state graph. 3 We use $\Delta\mathbb{DSG}$ to refer to the class of all Dyck state work-graphs.

An ε-closure work-graph is a set $\Delta H$ of new ε-edges. We use $\Delta\mathbb{ECG}$ to refer to the class of all ε-closure work-graphs.

8.3 A new fixed-point iteration-space 

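Instead of consuming a Dyck state graph and producing a Dyck state graph, the new fixed-point iteration function will consume and produce a Dyck state graph, an ε-closure graph, a Dyck state work-graph and an ε-closure work-graph. Hence, the iteration space of the new algorithm is:

\[ \mathbb{IDSG} = \mathbb{DSG} \times \mathbb{ECG} \times \Delta\mathbb{DSG} \times \Delta\mathbb{ECG}. \]

(The I in IDSG stands for intermediate.)

3 Technically, a work-graph is not an actual graph, since $\Delta E \not\subseteq \Delta S \times \Gamma_\pm \times \Delta S$; a work-graph is just a set of nodes and a set of edges.

\begin{align*}
\mathcal{F}'(M) &= f, \text{ where} \\
M &= (Q, \Gamma, \delta, q_0) \\
f(G, G_\epsilon, \Delta G, \Delta H) &= (G', G'_\epsilon, \Delta G', \Delta H' - H), \text{ where} \\
(S, \Gamma, E, s_0) &= G \\
(S, H) &= G_\epsilon \\
(\Delta S, \Delta E) &= \Delta G \\
(\Delta E_0, \Delta H_0) &= \bigcup_{s \in \Delta S} \mathit{sprout}_M(s) \\
(\Delta E_1, \Delta H_1) &= \bigcup_{(s, \gamma_+, s') \in \Delta E} \mathit{addPush}_M(G, G_\epsilon)(s, \gamma_+, s') \\
(\Delta E_2, \Delta H_2) &= \bigcup_{(s, \gamma_-, s') \in \Delta E} \mathit{addPop}_M(G, G_\epsilon)(s, \gamma_-, s') \\
(\Delta E_3, \Delta H_3) &= \bigcup_{(s, \epsilon, s') \in \Delta E} \mathit{addEmpty}_M(G, G_\epsilon)(s, s') \\
(\Delta E_4, \Delta H_4) &= \bigcup_{(s, s') \in \Delta H} \mathit{addEmpty}_M(G, G_\epsilon)(s, s') \\
S' &= S \cup \Delta S \\
E' &= E \cup \Delta E \\
H' &= H \cup \Delta H \\
\Delta E' &= \Delta E_0 \cup \Delta E_1 \cup \Delta E_2 \cup \Delta E_3 \cup \Delta E_4 \\
\Delta S' &= \{ s' : (s, g, s') \in \Delta E' \} \\
\Delta H' &= \Delta H_0 \cup \Delta H_1 \cup \Delta H_2 \cup \Delta H_3 \cup \Delta H_4 \\
G' &= (S \cup \Delta S, \Gamma, E', q_0) \\
G'_\epsilon &= (S', H') \\
\Delta G' &= (\Delta S' - S', \Delta E' - E').
\end{align*}

Figure 3. The fixed point of the function $\mathcal{F}'(M)$ contains the Dyck state graph of the rooted pushdown system $M$.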

8.4 The ε-closure graph work-list algorithm

The function $\mathcal{F}' : \mathbb{RPDS} \to (\mathbb{IDSG} \to \mathbb{IDSG})$ generates the required iteration function (Figure 3). Please note that we implicitly distribute union across tuples:

\[ (X, Y) \cup (X', Y') = (X \cup X', Y \cup Y'). \]

The functions sprout, addPush, addPop and addEmpty calculate the additional Dyck state graph edges and ε-closure graph edges (potentially) introduced by a new state or edge.

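Sprouting. Whenever a new state gets added to the Dyck state graph, the algorithm must check whether that state has any new edges to contribute. Neither push edges nor ε-edges depend on the current stack, so any such edges for a state in the pushdown system's transition function belong in the Dyck state graph. The sprout function:

\[ \mathit{sprout}_{(Q,\Gamma,\delta)} : Q \to (\mathcal{P}(\delta) \times \mathcal{P}(Q \times Q)), \]

checks whether a new state could produce any new push edges or no-change edges. We can represent its behavior diagrammatically:

\[ q' \overset{\epsilon}{\leftarrowtail} s \overset{\gamma_+}{\rightarrowtail} q'' \]

which means if adding control state $s$:
add edge $s \overset{\epsilon}{\rightarrowtail} q'$ if it exists in $\delta$, and
add edge $s \overset{\gamma_+}{\rightarrowtail} q''$ if it exists in $\delta$.

Formally:

\begin{align*}
\mathit{sprout}_{(Q,\Gamma,\delta)}(s) &= (\Delta E, \Delta H), \text{ where} \\
\Delta E &= \left\{ s \overset{\epsilon}{\rightarrowtail} q : s \overset{\epsilon}{\rightarrowtail} q \in \delta \right\} \cup \left\{ s \overset{\gamma_+}{\rightarrowtail} q : s \overset{\gamma_+}{\rightarrowtail} q \in \delta \right\} \\
\Delta H &= \left\{ s \rightarrowtail q : s \overset{\epsilon}{\rightarrowtail} q \in \delta \right\}.
\end{align*}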



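Considering the consequences of a new push edge. Once our algorithm adds a new push edge to a Dyck state graph, there is a chance that it will enable new pop edges for the same stack frame somewhere downstream. If and when it does enable pops, it will also add new edges to the ε-closure graph. The addPush function:

\[ \mathit{addPush}_{(Q,\Gamma,\delta)} : \mathbb{DSG} \times \mathbb{ECG} \to \delta \to (\mathcal{P}(\delta) \times \mathcal{P}(Q \times Q)), \]

checks for ε-reachable states that could produce a pop. We can represent this action diagrammatically:

\[ s \overset{\gamma_+}{\rightarrowtail} q \overset{\epsilon}{\rightarrowtail} q' \overset{\gamma_-}{\rightarrowtail} q'' \qquad \text{with an implied } \epsilon\text{-edge from } s \text{ to } q'' \]

which means if adding push-edge $s \overset{\gamma_+}{\rightarrowtail} q$:
if pop-edge $q' \overset{\gamma_-}{\rightarrowtail} q''$ is in $\delta$, then
add edge $q' \overset{\gamma_-}{\rightarrowtail} q''$, and
add ε-edge $s \rightarrowtail q''$.

Formally:

\begin{align*}
\mathit{addPush}_{(Q,\Gamma,\delta)}(G, G_\epsilon)(s \overset{\gamma_+}{\rightarrowtail} q) &= (\Delta E, \Delta H), \text{ where} \\
\Delta E &= \left\{ q' \overset{\gamma_-}{\rightarrowtail} q'' : q' \in \overrightarrow{G}_\epsilon[q] \text{ and } q' \overset{\gamma_-}{\rightarrowtail} q'' \in \delta \right\} \\
\Delta H &= \left\{ s \rightarrowtail q'' : q' \in \overrightarrow{G}_\epsilon[q] \text{ and } q' \overset{\gamma_-}{\rightarrowtail} q'' \in \delta \right\}.
\end{align*}

Considering the consequences of a new pop edge. Once the algorithm adds a new pop edge to a Dyck state graph, it will create at least one new ε-closure graph edge and possibly more by matching up with upstream pushes. The addPop function:

\[ \mathit{addPop}_{(Q,\Gamma,\delta)} : \mathbb{DSG} \times \mathbb{ECG} \to \delta \to (\mathcal{P}(\delta) \times \mathcal{P}(Q \times Q)), \]

checks for ε-reachable push-edges that could match this pop-edge. We can represent this action diagrammatically:

\[ s \overset{\gamma_+}{\rightarrowtail} s' \overset{\epsilon}{\rightarrowtail} s'' \overset{\gamma_-}{\rightarrowtail} q \qquad \text{with an implied } \epsilon\text{-edge from } s \text{ to } q \]

which means if adding pop-edge $s'' \overset{\gamma_-}{\rightarrowtail} q$:
if push-edge $s \overset{\gamma_+}{\rightarrowtail} s'$ is already in the Dyck state graph, then
add ε-edge $s \rightarrowtail q$.

Formally:

\begin{align*}
\mathit{addPop}_{(Q,\Gamma,\delta)}(G, G_\epsilon)(s'' \overset{\gamma_-}{\rightarrowtail} q) &= (\Delta E, \Delta H), \text{ where} \\
\Delta E &= \emptyset \\
\Delta H &= \left\{ s \rightarrowtail q : s' \in \overleftarrow{G}_\epsilon[s''] \text{ and } s \overset{\gamma_+}{\rightarrowtail} s' \in G \right\}.
\end{align*}

Considering the consequences of a new ε-edge. Once the algorithm adds a new ε-closure graph edge, it may transitively have to add more ε-closure graph edges, and it may connect an old push to (perhaps newly enabled) pop edges. The addEmpty function:

\[ \mathit{addEmpty}_{(Q,\Gamma,\delta)} : \mathbb{DSG} \times \mathbb{ECG} \to (Q \times Q) \to (\mathcal{P}(\delta) \times \mathcal{P}(Q \times Q)), \]

checks for newly enabled pops and ε-closure graph edges. Once again, we can represent this action diagrammatically:

\[ s \overset{\gamma_+}{\rightarrowtail} s' \overset{\epsilon}{\rightarrowtail} s'' \overset{\epsilon}{\rightarrowtail} s''' \overset{\epsilon}{\rightarrowtail} s'''' \overset{\gamma_-}{\rightarrowtail} q \qquad \text{with implied } \epsilon\text{-edges } s \rightarrowtail q,\ s' \rightarrowtail s''',\ s'' \rightarrowtail s'''',\ s' \rightarrowtail s'''' \]

which means if adding ε-edge $s'' \rightarrowtail s'''$:
if pop-edge $s'''' \overset{\gamma_-}{\rightarrowtail} q$ is in $\delta$, then
add ε-edge $s \rightarrowtail q$; and
add edge $s'''' \overset{\gamma_-}{\rightarrowtail} q$;
add ε-edges $s' \rightarrowtail s'''$, $s'' \rightarrowtail s''''$, and $s' \rightarrowtail s''''$.

Formally:

\begin{align*}
\mathit{addEmpty}_{(Q,\Gamma,\delta)}(G, G_\epsilon)(s'' \rightarrowtail s''') &= (\Delta E, \Delta H), \text{ where} \\
\Delta E &= \left\{ s'''' \overset{\gamma_-}{\rightarrowtail} q : s' \in \overleftarrow{G}_\epsilon[s''] \text{ and } s'''' \in \overrightarrow{G}_\epsilon[s'''] \text{ and } s \overset{\gamma_+}{\rightarrowtail} s' \in G \text{ and } s'''' \overset{\gamma_-}{\rightarrowtail} q \in \delta \right\} \\
\Delta H &= \left\{ s \rightarrowtail q : s' \in \overleftarrow{G}_\epsilon[s''] \text{ and } s'''' \in \overrightarrow{G}_\epsilon[s'''] \text{ and } s \overset{\gamma_+}{\rightarrowtail} s' \in G \text{ and } s'''' \overset{\gamma_-}{\rightarrowtail} q \in \delta \right\} \\
&\cup \left\{ s' \rightarrowtail s''' : s' \in \overleftarrow{G}_\epsilon[s''] \right\} \\
&\cup \left\{ s'' \rightarrowtail s'''' : s'''' \in \overrightarrow{G}_\epsilon[s'''] \right\} \\
&\cup \left\{ s' \rightarrowtail s'''' : s' \in \overleftarrow{G}_\epsilon[s''] \text{ and } s'''' \in \overrightarrow{G}_\epsilon[s'''] \right\}.
\end{align*}

8.5 Termination and correctness

Because the iteration function is no longer monotonic, we have to prove that a fixed point exists. It is trivial to show that the Dyck state graph component of the iteration-space ascends monotonically with each application; that is:

Lemma 8.1. Given $M \in \mathbb{RPDS}$, $G \in \mathbb{DSG}$ such that $G \subseteq M$, if $\mathcal{F}'(M)(G, G_\epsilon, \Delta G, \Delta H) = (G', G'_\epsilon, \Delta G', \Delta H')$, then $G \subseteq G'$.

Since the size of the Dyck state graph is bounded by the original pushdown system $M$, the Dyck state graph will eventually reach a fixed point. Once the Dyck state graph reaches a fixed point, both work-graphs/sets will be empty, and the ε-closure graph will also stabilize. We can also show that this algorithm is correct:

Theorem 8.1. $\mathrm{lfp}(\mathcal{F}'(M)) = (\mathcal{DSG}(M), G_\epsilon, (\emptyset, \emptyset), \emptyset)$.

Proof. The proof is similar in structure to the previous one. □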
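To make the interplay of the four helper functions concrete, here is a minimal Python sketch of a work-list construction in the spirit of Figure 3. It is an illustration under assumptions, not the formulation above: it processes one work item at a time instead of whole work-graphs, it encodes transitions as hypothetical triples `(q, (kind, frame), q2)`, and every identifier in it (`dyck_state_graph`, `pushes_into`, `pops_from`, and so on) is invented for the example.

```python
from collections import defaultdict, deque

EPS, PUSH, POP = "eps", "push", "pop"

def dyck_state_graph(delta, q0):
    """Return (S, E, H): states, edges, and non-reflexive eps-closure edges."""
    out = defaultdict(set)
    for (q, act, q2) in delta:
        out[q].add((act, q2))

    S, E, H = set(), set(), set()
    succ, pred = defaultdict(set), defaultdict(set)  # eps-closure adjacency
    pushes_into = defaultdict(set)  # s1 -> {(s, g)} for push edges s -g+-> s1
    work = deque([("state", q0)])

    def desc(s): return succ[s] | {s}
    def anc(s): return pred[s] | {s}

    def pops_from(s, g):
        return [(s, (POP, g), t) for ((k, g2), t) in out[s]
                if k == POP and g2 == g]

    while work:
        kind, item = work.popleft()
        if kind == "state":                      # sprout
            if item in S:
                continue
            S.add(item)
            for ((k, g), t) in out[item]:
                if k in (EPS, PUSH):
                    work.append(("edge", (item, (k, g), t)))
        elif kind == "edge":
            if item in E:
                continue
            E.add(item)
            s, (k, g), t = item
            work.append(("state", t))
            if k == EPS:
                work.append(("eps", (s, t)))
            elif k == PUSH:                      # addPush
                pushes_into[t].add((s, g))
                for q1 in desc(t):
                    for pop in pops_from(q1, g):
                        work.append(("edge", pop))
                        work.append(("eps", (s, pop[2])))
            else:                                # addPop
                for s1 in anc(s):
                    for (s0, g0) in pushes_into[s1]:
                        if g0 == g:
                            work.append(("eps", (s0, t)))
        else:                                    # addEmpty
            a, b = item
            if a == b or b in succ[a]:
                continue
            ancs, descs = anc(a), desc(b)        # snapshot before inserting
            succ[a].add(b); pred[b].add(a); H.add((a, b))
            for s1 in ancs:
                for s4 in descs:
                    work.append(("eps", (s1, s4)))
                    for (s0, g) in pushes_into[s1]:
                        for pop in pops_from(s4, g):
                            work.append(("edge", pop))
                            work.append(("eps", (s0, pop[2])))
    return S, E, H

delta = {
    ("q0", (PUSH, "a"), "q1"), ("q1", (PUSH, "b"), "q2"),
    ("q2", (POP, "b"), "q3"), ("q3", (POP, "a"), "q4"),
    ("q3", (POP, "b"), "q5"),  # illegal: "a" is on top at q3, not "b"
}
S, E, H = dyck_state_graph(delta, "q0")
```

On this toy system the sketch discovers the nested call structure: the inner push/pop pair yields the ε-closure edge from `q1` to `q3`, which in turn enables the outer pop and the edge from `q0` to `q4`, while `q5` is never added.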



8.6 Complexity: Still exponential, but more efficient

As with the previous algorithm, to determine the complexity of this algorithm, we ask two questions: how many times would the algorithm invoke the iteration function in the worst case, and how much does each invocation cost in the worst case? The run-time of the algorithm is bounded by the size of the final Dyck state graph plus the size of the ε-closure graph. Suppose the final Dyck state graph has $m$ states. In the worst case, the iteration function adds only a single edge each time. There are at most $2|\Gamma|m^2 + m^2$ edges in the Dyck state graph and at most $m^2$ edges in the ε-closure graph, which bounds the number of iterations.

Next, we must reason about the worst-case cost of adding an edge: how many edges might an individual iteration consider? In the worst case, the algorithm will consider every edge in every iteration, leading to an asymptotic time-complexity of:

\[ O\big((2|\Gamma|m^2 + 2m^2)^2\big) = O(|\Gamma|^2 m^4). \]

While still high, this is an improvement upon the previous algorithm. For sparse Dyck state graphs, this is a reasonable algorithm.

9. Polynomial-time complexity from widening 

In the previous section, we developed a more efficient fixed-point 
algorithm for computing a Dyck state graph. Even with the core 
improvements we made, the algorithm remained exponential in the 
worst case, owing to the fact that there could be an exponential 
number of reachable control states. When an abstract interpreta- 
tion is intolerably complex, the standard approach for reducing 
complexity and accelerating convergence is widening [Cousot and 
Cousot 1977]. (Of course, widening techniques trade away some 
precision to gain this speed.) It turns out that the small-step variants 
of finite-state CFAs are exponential without some sort of widening 
as well. 

To achieve polynomial time complexity for pushdown control- 
flow analysis requires the same two steps as the classical case: 
(1) widening the abstract interpretation to use a global, "single- 
threaded" store and (2) selecting a monovariant allocation function 
to collapse the abstract configuration-space. Widening eliminates a 
source of exponentiality in the size of the store; monovariance elim- 
inates a source of exponentiality from environments. In this section, 
we redevelop the pushdown control-flow analysis framework with 
a single-threaded store and calculate its complexity. 

9.1 Step 1: Refactor the concrete semantics 

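First, consider defining the reachable states of the concrete semantics using fixed points. That is, let the system-space of the evaluation function be sets of configurations:

\[ C \in \mathit{System} = \mathcal{P}(\mathit{Conf}) = \mathcal{P}(\mathsf{Exp} \times \mathit{Env} \times \mathit{Store} \times \mathit{Kont}). \]

We can redefine the concrete evaluation function:

\begin{align*}
\mathcal{E}(e) &= \mathrm{lfp}(f_e), \text{ where } f_e : \mathit{System} \to \mathit{System} \text{ and} \\
f_e(C) &= \{ \mathcal{I}(e) \} \cup \{ c' : c \in C \text{ and } c \Rightarrow c' \}.
\end{align*}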

9.2 Step 2: Refactor the abstract semantics 

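We can take the same approach with the abstract evaluation function, first redefining the abstract system-space:

\[ \hat{C} \in \widehat{System} = \mathcal{P}(\widehat{Conf}) = \mathcal{P}(\mathsf{Exp} \times \widehat{Env} \times \widehat{Store} \times \widehat{Kont}), \]

and then the abstract evaluation function:

\begin{align*}
\hat{\mathcal{E}}(e) &= \mathrm{lfp}(\hat{f}_e), \text{ where } \hat{f}_e : \widehat{System} \to \widehat{System} \text{ and} \\
\hat{f}_e(\hat{C}) &= \{ \hat{\mathcal{I}}(e) \} \cup \{ \hat{c}' : \hat{c} \in \hat{C} \text{ and } \hat{c} \leadsto \hat{c}' \}.
\end{align*}

What we'd like to do is shrink the abstract system-space with a refactoring that corresponds to a widening.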

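9.3 Step 3: Single-thread the abstract store

We can approximate a set of abstract stores $\{\hat{\sigma}_1, \ldots, \hat{\sigma}_n\}$ with the least-upper-bound of those stores: $\hat{\sigma}_1 \sqcup \cdots \sqcup \hat{\sigma}_n$. We can exploit this by creating a new abstract system space in which the store is factored out of every configuration. Thus, the system-space contains a set of partial configurations and a single global store:

\begin{align*}
\widehat{System}' &= \mathcal{P}(\widehat{PConf}) \times \widehat{Store} \\
\hat{\pi} \in \widehat{PConf} &= \mathsf{Exp} \times \widehat{Env} \times \widehat{Kont}.
\end{align*}

We can factor the store out of the abstract transition relation as well, so that $(\leadsto_{\hat{\sigma}}) \subseteq \widehat{PConf} \times (\widehat{PConf} \times \widehat{Store})$:

\[ (e, \hat{\rho}, \hat{\kappa}) \leadsto_{\hat{\sigma}} ((e', \hat{\rho}', \hat{\kappa}'), \hat{\sigma}') \text{ iff } (e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \leadsto (e', \hat{\rho}', \hat{\sigma}', \hat{\kappa}'), \]

which gives us a new iteration function, $\hat{f}'_e : \widehat{System}' \to \widehat{System}'$,

\begin{align*}
\hat{f}'_e(\hat{P}, \hat{\sigma}) &= (\hat{P}', \hat{\sigma}'), \text{ where} \\
\hat{P}' &= \left\{ \hat{\pi}' : \hat{\pi} \in \hat{P} \text{ and } \hat{\pi} \leadsto_{\hat{\sigma}} (\hat{\pi}', \hat{\sigma}'') \right\} \cup \{ \hat{\pi}_0 \} \\
\hat{\sigma}' &= \bigsqcup \left\{ \hat{\sigma}'' : \hat{\pi} \in \hat{P} \text{ and } \hat{\pi} \leadsto_{\hat{\sigma}} (\hat{\pi}', \hat{\sigma}'') \right\} \\
(\hat{\pi}_0, \langle\rangle) &= \hat{\mathcal{I}}(e).
\end{align*}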
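The effect of single-threading the store can be illustrated with a small Python sketch. Everything in it is hypothetical: stores are dictionaries from addresses to frozen sets of abstract values, `toy_step` stands in for the store-factored transition relation, and it returns only the bindings it adds (which is equivalent here, because results are joined into the global store). The loop also accumulates partial configurations from one round to the next, which is an assumption of the sketch; it agrees with the least fixed point when the step function is monotone in the store.

```python
def join(s1, s2):
    """Pointwise least upper bound of two abstract stores."""
    out = dict(s1)
    for addr, vals in s2.items():
        out[addr] = out.get(addr, frozenset()) | vals
    return out

def analyze(step, pi0):
    """Iterate over (partial configurations, one global store) to a fixed point."""
    P, store = {pi0}, {}
    while True:
        P2, store2 = set(P), dict(store)
        for pi in P:
            for pi2, s2 in step(pi, store):
                P2.add(pi2)
                store2 = join(store2, s2)  # widen: merge into the single store
        if (P2, store2) == (P, store):
            return P, store
        P, store = P2, store2

def toy_step(pi, store):
    """A made-up transition relation over three partial configurations."""
    if pi == "a":
        return [("b", {"x": frozenset({1})})]
    if pi == "b":
        xs = store.get("x", frozenset())
        return [("c", {"y": frozenset({v + 1 for v in xs})})] if xs else []
    if pi == "c":
        return [("b", {"x": frozenset({2})})]
    return []

P, store = analyze(toy_step, "a")
```

Because every partial configuration sees the same store, the value written to `x` by `c` flows to `b` even along paths where it would not in a per-configuration store; this is the precision that the widening trades for speed.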



9.4 Step 4: Dyck state control-flow graphs 

Following the earlier Dyck state graph reformulation of the pushdown system, we can reformulate the set of partial configurations as a Dyck state control-flow graph. A Dyck state control-flow graph is a frame-action-labeled graph over partial control states, and a partial control state is an expression paired with an environment:

\begin{align*}
\widehat{System}'' &= \widehat{DSCFG} \times \widehat{Store} \\
\widehat{DSCFG} &= \mathcal{P}(\widehat{PState}) \times \mathcal{P}(\widehat{PState} \times \widehat{Frame}_\pm \times \widehat{PState}) \\
\hat{\psi} \in \widehat{PState} &= \mathsf{Exp} \times \widehat{Env}.
\end{align*}

In a Dyck state control-flow graph, the partial control states are partial configurations which have dropped the continuation component; the continuations are encoded as paths through the graph.

If we wanted to do so, we could define a new monotonic iteration function analogous to the simple fixed-point formulation of Section 7:

\[ \hat{f}''_e : \widehat{System}'' \to \widehat{System}'', \]

again using CFL-reachability to add pop edges at each step.

A preliminary analysis of complexity. Even without defining the system-space iteration function, we can ask: how many iterations will it take to reach a fixed point in the worst case? This question is really asking: how many edges can we add, and how many entries are there in the store? Summing these together, we arrive at the worst-case number of iterations:

\[ \underbrace{|\widehat{PState}| \times |\widehat{Frame}_\pm| \times |\widehat{PState}|}_{\text{DSCFG edges}} + \underbrace{|\widehat{Addr}| \times |\widehat{Clo}|}_{\text{store entries}}. \]

With a monovariant allocation scheme that eliminates abstract environments, the number of iterations ultimately reduces to:

\[ |\mathsf{Exp}| \times (2|\mathsf{Var}| + 1) \times |\mathsf{Exp}| + |\mathsf{Var}| \times |\mathsf{Lam}|, \]

which means that, in the worst case, the algorithm makes a cubic number of iterations with respect to the size of the input program. The worst-case cost of each iteration would be dominated by a CFL-reachability calculation, which, in the worst case, must consider every state and every edge:

\[ O(|\mathsf{Var}|^3 \times |\mathsf{Exp}|^3). \]

Thus, each iteration takes $O(n^6)$ and there are a maximum of $O(n^3)$ iterations, where $n$ is the size of the program. So, the total complexity would be $O(n^9)$ for a monovariant pushdown control-flow analysis with this scheme. Although this algorithm is polynomial-time, we can do better.
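The arithmetic behind these bounds can be checked directly. The sketch below assumes that $|\mathsf{Exp}|$, $|\mathsf{Var}|$ and $|\mathsf{Lam}|$ are each at most $n$, the size of the program.

```latex
\begin{align*}
|\mathsf{Exp}| \,(2|\mathsf{Var}| + 1)\, |\mathsf{Exp}| + |\mathsf{Var}|\,|\mathsf{Lam}|
  &\le n(2n + 1)n + n^2 = 2n^3 + 2n^2 = O(n^3)
  && \text{(iterations)} \\
O(|\mathsf{Var}|^3 \times |\mathsf{Exp}|^3) &= O(n^3 \cdot n^3) = O(n^6)
  && \text{(cost per iteration)} \\
O(n^3) \times O(n^6) &= O(n^9)
  && \text{(total)}
\end{align*}
```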

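9.5 Step 5: Reintroduce ε-closure graphs

Replicating the evolution from Section 8 for this store-widened analysis, we arrive at a more efficient polynomial-time analysis. An ε-closure graph in this setting is a set of pairs of store-less, continuation-less partial states:

\[ \widehat{ECG} = \mathcal{P}(\widehat{PState} \times \widehat{PState}). \]

Then, we can set the system space to include ε-closure graphs:

\[ \widehat{System}''' = \widehat{DSCFG} \times \widehat{ECG} \times \widehat{Store}. \]

Before we redefine the iteration function, we need another factored transition relation. The stack- and action-factored transition relation $(\leadsto^{g}_{\hat{\sigma}}) \subseteq \widehat{PState} \times \widehat{PState} \times \widehat{Store}$ determines if a transition is possible under the specified store and stack-action:

\begin{align*}
(e, \hat{\rho}) \leadsto^{\hat{\phi}_+}_{\hat{\sigma}} ((e', \hat{\rho}'), \hat{\sigma}') &\text{ iff } (e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \leadsto (e', \hat{\rho}', \hat{\sigma}', \hat{\phi} : \hat{\kappa}) \\
(e, \hat{\rho}) \leadsto^{\hat{\phi}_-}_{\hat{\sigma}} ((e', \hat{\rho}'), \hat{\sigma}') &\text{ iff } (e, \hat{\rho}, \hat{\sigma}, \hat{\phi} : \hat{\kappa}) \leadsto (e', \hat{\rho}', \hat{\sigma}', \hat{\kappa}) \\
(e, \hat{\rho}) \leadsto^{\epsilon}_{\hat{\sigma}} ((e', \hat{\rho}'), \hat{\sigma}') &\text{ iff } (e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \leadsto (e', \hat{\rho}', \hat{\sigma}', \hat{\kappa}).
\end{align*}

Now, we can redefine the iteration function (Figure 4).

Theorem 9.1. Pushdown 0CFA can be computed in $O(n^6)$-time, where $n$ is the size of the program.

Proof. As before, the maximum number of iterations is cubic in the size of the program for a monovariant analysis. Fortunately, the cost of each iteration is also now bounded by the number of edges in the graph, which is also cubic. □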

4 In computing the number of frames, we note that in every continuation, the variable and the expression uniquely determine each other based on the let-expression from which they both came. As a result, the number of abstract frames available in a monovariant analysis is bounded by both the number of variables and the number of expressions, i.e., $|\widehat{Frame}| = |\mathsf{Var}|$.

Figure 4. An ε-closure graph-powered iteration function for pushdown control-flow analysis with a single-threaded store.

10. Applications

Pushdown control-flow analysis offers more precise control-flow analysis results than the classical finite-state CFAs. Consequently, pushdown control-flow analysis improves flow-driven optimizations (e.g., constant propagation, global register allocation, inlining [Shivers 1991]) by eliminating more of the false positives that block their application.

The more compelling applications of pushdown control-flow analysis are those which are difficult to drive with classical control-flow analysis. Perhaps not surprisingly, the best examples of such analyses are escape analysis and interprocedural dependence analysis. Both of these analyses are limited by a static analyzer's ability to reason about the stack, the core competency of pushdown control-flow analysis. (We leave an in-depth formulation and study of these analyses to future work.)

10.1 Escape analysis 

In escape analysis, the objective is to determine whether a heap-allocated object is safely convertible into a stack-allocated object. In other words, the compiler is trying to figure out whether the frame in which an object is allocated outlasts the object itself. In higher-order languages, closures are candidates for escape analysis. Determining whether all closures over a particular λ-term lam may be stack-allocated is straightforward: find the control states in the Dyck state graph in which closures over lam are being created, then find all control states reachable from these states over only ε-edge and push-edge transitions. Call this set of control states the "safe" set. Now find all control states which are invoking a closure over lam. If any of these control states lies outside of the safe set, then stack-allocation may not be safe; if, however, all invocations lie within the safe set, then stack-allocation of the closure is safe.
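The safe-set check described above amounts to a graph search restricted to two kinds of edges. The following Python sketch shows one way to phrase it; the edge encoding `(s, (kind, frame), t)`, the function name `stack_allocation_safe`, and the sets `creators` and `invokers` (the control states that create or invoke closures over the λ-term of interest) are assumptions made for the example.

```python
from collections import deque

def stack_allocation_safe(edges, creators, invokers):
    """True if every invoking state is reachable from a creating state
    using only eps-edges and push-edges of the Dyck state graph."""
    safe, work = set(creators), deque(creators)
    while work:
        s = work.popleft()
        for (u, (kind, _frame), t) in edges:
            if u == s and kind in ("eps", "push") and t not in safe:
                safe.add(t)
                work.append(t)
    return set(invokers) <= safe

edges = {
    ("create", ("push", "k"), "callee"),
    ("callee", ("eps", None), "use_inside"),
    ("callee", ("pop", "k"), "returned"),
    ("returned", ("eps", None), "use_after_return"),
}
inside_only = stack_allocation_safe(edges, {"create"}, {"use_inside"})
escapes = stack_allocation_safe(edges, {"create"}, {"use_inside", "use_after_return"})
```

In the toy graph, an invocation that happens only while the creating frame is still live is reported as safe, whereas an invocation reached by popping past that frame is not.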

10.2 Interprocedural dependence analysis 

In interprocedural dependence analysis, the goal is to determine, for each λ-term, the set of resources which it may read or write when it is called. Might and Prabhu showed that if one has knowledge of the program stack, then one can uncover interprocedural dependencies [Might and Prabhu 2009]. We can adapt that technique to work with Dyck state graphs. For each control state, find the set of reachable control states along only ε-edges and pop-edges. The frames on the pop-edges determine the frames which could have been on the stack when in the control state. The frames that are live on the stack determine the procedures that are live on the stack. Every procedure that is live on the stack has a read-dependence on any resource being read in the control state, while every procedure that is live on the stack also has a write-dependence on any resource being written in the control state. This logic is the direct complement of "if f calls g and g accesses a, then f also accesses a."
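As an illustration of this procedure, the Python sketch below collects the frames found on pop-edges reachable over ε-edges and pop-edges, and then charges each read to the procedures that own those frames. The edge encoding, the function names, the `reads` map from control states to resources, and the `owner` map from frames to procedures are all hypothetical; write-dependences would be computed the same way from a map of written resources.

```python
from collections import deque

def frames_possibly_on_stack(edges, state):
    """Frames on pop-edges reachable from state via eps- and pop-edges."""
    seen, work, frames = {state}, deque([state]), set()
    while work:
        s = work.popleft()
        for (u, (kind, frame), t) in edges:
            if u != s or kind not in ("eps", "pop"):
                continue
            if kind == "pop":
                frames.add(frame)
            if t not in seen:
                seen.add(t)
                work.append(t)
    return frames

def read_dependences(edges, reads, owner):
    """Map each procedure live on the stack to the resources read beneath it."""
    deps = {}
    for state, resources in reads.items():
        for frame in frames_possibly_on_stack(edges, state):
            deps.setdefault(owner[frame], set()).update(resources)
    return deps

# main calls f (pushing k_main), f calls g (pushing k_f), g reads "a" at g1.
edges = {
    ("m0", ("push", "k_main"), "f0"),
    ("f0", ("push", "k_f"), "g0"),
    ("g0", ("eps", None), "g1"),
    ("g1", ("pop", "k_f"), "f1"),
    ("f1", ("pop", "k_main"), "m1"),
}
deps = read_dependences(edges, reads={"g1": {"a"}},
                        owner={"k_main": "main", "k_f": "f"})
```

The result attributes the read of `a` to both `f` and `main`, the two procedures whose frames are live while `g` runs; the direct dependence of `g` itself would be recorded separately from the control state.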

11. Related work 

Pushdown control-flow analysis draws on work in higher-order 
control-flow analysis [Shivers 1991], abstract machines [Felleisen 
and Friedman 1987] and abstract interpretation [Cousot and Cousot 
1977].

Context-free analysis of higher-order programs. The closest related work for this is Vardoulakis and Shivers's very recent work on CFA2 [Vardoulakis and Shivers 2010]. CFA2 is a table-driven summarization algorithm that exploits the balanced nature of calls and returns to improve return-flow precision in a control-flow analysis. Though CFA2 alludes to exploiting context-free languages, context-free languages are not explicit in its formulation in the same way that pushdown systems are in pushdown control-flow analysis. With respect to CFA2, pushdown control-flow analysis is polyvariant, covers direct-style, and the monovariant instantiation is lower in complexity (CFA2 is exponential-time).

On the other hand, CFA2 distinguishes stack-allocated and 
store-allocated variable bindings, whereas our formulation of push- 
down control-flow analysis does not and allocates all bindings in 
the store. If CFA2 determines a binding can be allocated on the 
stack, that binding will enjoy added precision during the analysis 
and is not subject to merging like store-allocated bindings. 

Calculation approach to abstract interpretation. Midtgaard and Jensen [2009] systematically calculate 0CFA using the Cousot-Cousot-style calculational approach [1999] to abstract interpretation applied to an ANF λ-calculus. Like the present work, Midtgaard and Jensen start with the CESK machine of Flanagan et al. [1993] and employ a reachable-states model. The analysis is then constructed by composing well-known Galois connections to reveal a 0CFA incorporating reachability. The abstract semantics approximate the control stack component of the machine by its top element. The authors remark that monomorphism materializes in two mappings: "one mapping all bindings to the same variable," the other "merging all calling contexts of the same function." Essentially, the pushdown 0CFA of Section 4 corresponds to Midtgaard and Jensen's analysis when the latter mapping is omitted and the stack component of the machine is not abstracted.

CFL- and pushdown-reachability techniques. This work also draws on CFL- and pushdown-reachability analysis [Bouajjani et al. 1997, Kodumal and Aiken 2004, Reps 1998, Reps et al. 2005]. For instance, ε-closure graphs, or equivalent variants thereof, appear in many context-free-language and pushdown reachability algorithms. For the less efficient versions of our analyses, we implicitly invoked these methods as subroutines. When we found these algorithms lacking (as with their enumeration of control states), we developed Dyck state graph construction.

CFL-reachability techniques have also been used to compute 
classical finite-state abstraction CFAs [Melski and Reps 2000] 
and type-based polymorphic control-flow analysis [Rehof and 
Fahndrich 2001]. These analyses should not be confused with 
pushdown control-flow analysis, which is computing a fundamen- 
tally more precise kind of CFA. Moreover, Rehof and Fahndrich's 
method is cubic in the size of the typed program, but the types may 



be exponential in the size of the program. In addition, our technique 
is not restricted to typed programs. 

Model-checking higher-order recursion schemes. There is terminology overlap with work by Kobayashi [2009] on model-checking higher-order programs with higher-order recursion schemes, which are a generalization of context-free grammars in which productions can take higher-order arguments, so that an order-0 scheme is a context-free grammar. Kobayashi exploits a result by Ong [2006] which shows that model-checking these recursion schemes is decidable (but ELEMENTARY-complete) by transforming higher-order programs into higher-order recursion schemes. Given the generality of model-checking, Kobayashi's technique may be considered an alternate paradigm for the analysis of higher-order programs. For the case of order-0, both Kobayashi's technique and our own involve context-free languages, though ours is for control-flow analysis and his is for model-checking with respect to a temporal logic. After these surface similarities, the techniques diverge. Moreover, there does not seem to be a polynomial-time variant of Kobayashi's method.

Other escape and dependence analyses We presented escape and 
dependence analyses to prove a point: that pushdown control-flow 
analysis is more powerful than classical control-flow analysis, in 
the sense that it can answer different kinds of questions. We have 
not yet compared our analyses with the myriad escape and dependence analyses (e.g., [Blanchet 1998]) that exist in the literature,
though we do expect that, with their increased precision, our anal- 
yses will be strongly competitive. 

12. Conclusion 

Pushdown control-flow analysis is an alternative paradigm for 
the analysis of higher-order programs. By modeling the run-time 
program stack with the stack of a pushdown system, pushdown 
control-flow analysis precisely matches returns to their calls. We 
derived pushdown control-flow analysis as an abstract interpre- 
tation of a CESK machine in which its stack component is left 
unbounded. As this abstract interpretation ranged over an infinite state-space, we sought a decidable method for determining the reachable states. We found one by converting the abstracted
CESK into a PDA that recognized the language of legal control- 
state sequences. By intersecting this language with a specific regu- 
lar language and checking non-emptiness, we were able to answer 
control-flow questions. From the PDA formulation, we refined the 
technique to reduce complexity from doubly exponential, to best- 
case exponential, to worst-case exponential, to polynomial. We 
ended with an efficient, polyvariant and precise framework. 

Future work Pushdown control-flow analysis exploits the fact 
that clients of static analyzers often need information about control 
states rather than stacks. Should clients require information about 
complete configurations — control states plus stacks — our analy- 
sis is lacking. Our framework represents configurations as paths 
through Dyck state graphs. Its results can provide a regular de- 
scription of the stack, but at a cost proportional to the size of the 
graph. For a client like abstract garbage collection, which would 
pay this cost for every edge added to the graph, this cost is unac- 
ceptable. Our future work will examine how to incrementally sum- 
marize stacks paired with each control state during the analysis. 